Documentation read from 04/17/2019 22:07:26 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_gene_data.
svr_gene_data fld1 fld2 ... fldN <gene_ids.tbl >gene_data.tbl
Get one or more pieces of data about each specified gene.
This script takes as input a tab-delimited file with gene IDs at the end of each line. For each gene ID, one or more selected data items are appended to each line.
This is a pipe command: input is read from standard input and output is written to standard output.
The data items are specified as positional parameters on the command line and are appended to each output line in the order specified. If a single identifier refers to multiple genes, there will be one output line for each gene.
The permissible data items are as follows.
Comma-delimited list of evidence codes indicating the reason for the gene's current assignment.
The FIG ID of the gene.
Current functional assignment.
Name of the genome containing the gene.
Number of base pairs in the gene.
Comma-delimited list of location strings indicating the location of the gene in the genome. A location string consists of a contig ID, an underscore, the starting offset, the strand (+ or -), and the number of base pairs.
Comma-delimited list of PUBMED IDs for publications relating to the gene.
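The location string format described above (contig ID, underscore, starting offset, strand, number of base pairs) can be split back into its parts by downstream code. The following is a minimal Python sketch, not part of svr_gene_data itself. The names Location, parse_location, and parse_location_list are hypothetical, and the sketch assumes that contig IDs may contain underscores but not commas.

```python
import re
from typing import List, NamedTuple


class Location(NamedTuple):
    contig: str   # contig ID (may itself contain underscores)
    start: int    # starting offset
    strand: str   # "+" or "-"
    length: int   # number of base pairs


# The greedy (.+) consumes as much as it can, so only the final underscore
# acts as the separator before the offset/strand/length part.
_LOCATION_RE = re.compile(r"^(.+)_(\d+)([+-])(\d+)$")


def parse_location(text: str) -> Location:
    """Parse one location string into its four components."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a location string: {text!r}")
    contig, start, strand, length = match.groups()
    return Location(contig, int(start), strand, int(length))


def parse_location_list(field: str) -> List[Location]:
    """Parse a comma-delimited list of location strings (assumes no commas in contig IDs)."""
    return [parse_location(part) for part in field.split(",") if part.strip()]
```

For example, parse_location("NC_000913_1234+567") yields the contig NC_000913, offset 1234, strand +, and length 567; the ID and numbers here are made-up illustration values.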
Database source of the IDs specified: SEED for FIG IDs, GENE for standard gene identifiers, or LocusTag for locus tags. In addition, you may specify RefSeq, CMR, NCBI, Trembl, or UniProt for IDs from those databases. Use mixed to allow mixed ID types (though this may cause problems when the same ID has different meanings in different databases). Use prefixed to allow IDs with prefixes indicating the ID type (e.g. uni|P00934 for a UniProt ID, gi|135813 for an NCBI identifier, and so forth). The default is SEED.
The URL for the Sapling server, if it is to be different from the default.
Column index. If specified, indicates that the input IDs should be taken from the indicated column instead of the last column. The first column is column 1.
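The input conventions described above (IDs taken from the last column unless a 1-based column index is given, and prefixed IDs such as uni|P00934) can be illustrated with a small Python sketch. This is not the script's own code; pick_id and split_prefixed_id are hypothetical helper names, and how svr_gene_data actually resolves a prefix to a database is not shown here.

```python
from typing import Optional, Tuple


def pick_id(line: str, column: Optional[int] = None) -> str:
    """Return the gene ID from a tab-delimited line.

    With no column given, the last field is used; otherwise `column`
    is a 1-based index, matching the convention described above.
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    if not 1 <= column <= len(fields):
        raise IndexError(f"column {column} out of range for {len(fields)} fields")
    return fields[column - 1]


def split_prefixed_id(gene_id: str) -> Tuple[Optional[str], str]:
    """Split 'prefix|rest' at the first '|'; the prefix is None if there is no '|'."""
    prefix, sep, rest = gene_id.partition("|")
    if not sep:
        return None, gene_id
    return prefix, rest
```

With the examples from the text, split_prefixed_id("uni|P00934") gives the prefix uni and the identifier P00934, and split_prefixed_id("gi|135813") gives gi and 135813.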