Documentation read from 04/17/2019 22:07:27 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_protein_assertions.
svr_protein_assertions <gene_ids.tbl >assertion_data.tbl
Get a list of Annotation Clearinghouse assertions for the specified proteins.
The standard input should be a tab-delimited file with IDs in the last column. The IDs should be prefixed protein or gene IDs (e.g. uni|AYQ44, fig|360108.3.peg.1041, md5|4a+6lQzFY8hRkQyWPliFjw). For each of these identifiers, this script will search for an identifier in the Annotation Clearinghouse with an identical protein sequence that has an associated functional assignment. For that identifier, the following fields will be returned.
This is a pipe command. The input is taken from the standard input, and the output is to the standard output.
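As an illustration of the input convention, the following Python sketch pulls the ID out of the last tab-delimited column of each line and splits off its prefix. The names last_column_ids and split_prefix are hypothetical helpers for this example and are not part of the script; the sample IDs are the ones quoted in this documentation, and the first column of the sample rows is made up.

```python
import io

def last_column_ids(stream):
    """Yield the last tab-delimited field of each non-empty input line."""
    for line in stream:
        line = line.rstrip("\r\n")
        if line:
            yield line.split("\t")[-1]

def split_prefix(identifier):
    """Split a prefixed ID such as 'fig|360108.3.peg.1041' at the first '|'."""
    prefix, sep, rest = identifier.partition("|")
    if not sep:
        raise ValueError(f"identifier has no prefix: {identifier!r}")
    return prefix, rest

# Two sample rows; only the last column matters to the script.
sample = io.StringIO(
    "row1\tfig|360108.3.peg.1041\n"
    "row2\tmd5|4a+6lQzFY8hRkQyWPliFjw\n"
)
ids = list(last_column_ids(sample))
prefixes = [split_prefix(i)[0] for i in ids]
```

Checking the prefix before calling the script is a cheap way to catch rows whose last column does not hold a prefixed ID.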
1 if we believe the identifier corresponds to the exact gene identified by the input identifier, else 0. If the input identifier does not specify a particular gene, this column will always be 0.
1 if the assignment is considered expert, else 0.
The net effect is that for each identifier, we find the assignments for protein-equivalent identifiers in the Annotation Clearinghouse. Because there are many identifiers that produce the same protein sequence, each input line will generate multiple output lines.
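Because one input line can fan out into several output lines, a downstream step may want to keep only some of them. The following is a minimal Python sketch, assuming only that the output is tab-delimited and that the caller knows which column holds a given 0/1 flag. The column index is passed in explicitly because the column order is not spelled out here. rows_with_flag is a hypothetical helper, and the sample rows are made up for illustration.

```python
def rows_with_flag(lines, flag_column):
    """Return the rows (as lists of fields) whose flag_column equals '1'."""
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Skip rows too short to contain the requested column.
        if len(fields) > flag_column and fields[flag_column] == "1":
            kept.append(fields)
    return kept

# Made-up rows for illustration only; the real column layout may differ.
example_output = [
    "idA\tsome function\t1\t0",
    "idB\tsome function\t0\t1",
    "idC\tother function\t0\t0",
]
flagged_rows = rows_with_flag(example_output, flag_column=3)
```

The same helper works for either of the two 0/1 columns once their actual positions in the output are known.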
The URL for the Annotation Clearinghouse server, if it is to be different from the default.