Documentation read from 04/17/2019 22:07:24 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_NCBI_taxonomy.

svr_NCBI_taxonomy

Get taxonomy information from NCBI

Usage

    svr_NCBI_taxonomy  [options]  taxid ...        > tab_separated_data

    svr_NCBI_taxonomy  [options]  < file_of_taxids > tab_separated_data
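For scripted use, the first usage form can be wrapped in a small helper. The Python sketch below is illustrative only: the names build_argv and run_taxonomy are hypothetical, and it assumes svr_NCBI_taxonomy is on the PATH. It places -- before the taxids, which this documentation describes as marking the end of the flags.

```python
import subprocess

def build_argv(options, taxids):
    """Build the argument list for the 'taxid ...' usage form."""
    # "--" marks the end of the flags, so taxids are never parsed as options.
    return ["svr_NCBI_taxonomy", *options, "--", *[str(t) for t in taxids]]

def run_taxonomy(options, taxids):
    """Run the command and return its output lines (assumes it is on PATH)."""
    result = subprocess.run(build_argv(options, taxids),
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

# Only the argument list is built here; nothing is executed.
print(build_argv(["-s"], [83333, 83334]))
```

Keeping the argument construction separate from the call makes it easy to inspect the command before running it.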

Command-Line Options

General Options

- --

Marks the end of the flags.

-x --Id

Include the taxid at the start of each output line (always true for multiple taxa).

Options that Select Returned Data

If multiple options are specified, each value is prefixed with its key. Items are reported in the order requested.

-. --All

All available data from the set of options listed below.

-a --LineageAbbrev

Abbreviated lineage as semicolon-separated names. This might not be quite what you expect when the taxon is a division finer than a species, because the abbreviated taxonomy may not include the species name. See -A (--LineageAbbrevPlus) below.

-A --LineageAbbrevPlus

Abbreviated lineage plus any suffix of additional terms present in the full lineage, as semicolon-separated names (D).

-c --CommonName

Common name.

-d --Division

Name of the GenBank division. This is the full word, not the three-letter abbreviation used in the GenBank entry.

-f --Lineage

Full lineage as semicolon separated names.

-g --GeneticCode

Genetic code number.

-i --LineageAbbrevIds

Abbreviated lineage as tab separated taxids.

-I --LineageAbbrevPlusIds

Abbreviated lineage plus full lineage suffix as tab separated taxids.

-l --LineageIds

Full lineage as tab separated taxids.

-m --MitochondrialGeneticCode

Mitochondrial genetic code number.

-n --LineageNames

Full lineage as tab separated names.

-p --Parent

Parent taxid.

-r --Rank

Taxonomic rank.

-s --ScientificName

Scientific name.

-t --LineageAbbrevNames

Abbreviated lineage as tab separated names.

-T --LineageAbbrevPlusNames

Abbreviated lineage plus full lineage suffix as tab separated names.

Summary of lineage type and format flags:

    -------------------------------------------------
                                   Lineage
                           --------------------------
    Format                 Full    Abbrev  AbbrevPlus
    -------------------------------------------------
    name; name; ...         -f       -a       -A
    name \t name \t ...     -n       -t       -T       
    taxid \t taxid \t ...   -l       -i       -I
    -------------------------------------------------

Output Format

The output is one or more tab-delimited fields.

If the -x flag is given, or more than one taxid is specified, the taxid is the first column.

If more than one data type is requested, the next column is the keyword for the data on the line.

The requested data follow.
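Because the leading columns depend on how the command was invoked, a parser has to be told which ones to expect. The following Python sketch shows one way to split a line under the rules above. The function name parse_line and its parameters are hypothetical, and the sample strings assume literal tab characters between fields.

```python
def parse_line(line, has_taxid, has_key):
    """Split one output line into (taxid, key, values).

    has_taxid: True if -x was given or more than one taxid was requested.
    has_key:   True if more than one data type was requested.
    """
    fields = line.rstrip("\n").split("\t")
    # Optional leading columns come first, in this order.
    taxid = fields.pop(0) if has_taxid else None
    key = fields.pop(0) if has_key else None
    # Whatever remains is the requested data (one or more fields).
    return taxid, key, fields

print(parse_line("83333\tEscherichia coli K-12", True, False))
print(parse_line("LineageAbbrevIds\t2\t1224\t1236", False, True))
```

The values are returned as a list because the tab-separated lineage formats produce several fields on one line.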

Examples

Scientific Name:

 svr_NCBI_taxonomy -s 83333
 Escherichia coli K-12

Comparison of the NCBI abbreviated lineage (-a) to the abbreviated lineage plus suffix from full lineage (-A); note the addition of the species binomial from the full lineage (-f):

 svr_NCBI_taxonomy -saAf 83333
 ScientificName Escherichia coli K-12
 LineageAbbrev  Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia
 LineageAbbrevPlus      Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli
 Lineage        cellular organisms; Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli
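The relationship among the three lineages can be illustrated with a short Python sketch. This is one reading of the -A description (the abbreviated lineage followed by whatever terms come after its last entry in the full lineage), not the script's actual implementation; the function name abbrev_plus is hypothetical.

```python
def abbrev_plus(abbrev, full):
    """Append to `abbrev` the terms of `full` that follow its last term."""
    if not abbrev or abbrev[-1] not in full:
        return list(abbrev)  # nothing to anchor the suffix on
    # Position of the last occurrence of the final abbreviated term.
    last = len(full) - 1 - full[::-1].index(abbrev[-1])
    return list(abbrev) + full[last + 1:]

full = ("cellular organisms; Bacteria; Proteobacteria; Gammaproteobacteria; "
        "Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli").split("; ")
abbrev = ("Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; "
          "Enterobacteriaceae; Escherichia").split("; ")
print("; ".join(abbrev_plus(abbrev, full)))
```

With the lineages from the example above, the result matches the LineageAbbrevPlus line: the abbreviated lineage with "Escherichia coli" appended.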

Multiple taxids:

 svr_NCBI_taxonomy -s 83333 83334
 83333  Escherichia coli K-12
 83334  Escherichia coli O157:H7

Because multiple taxids are accepted, the taxids of the full lineage (-l) can be fed back in to name every ancestor (an evil way to get a nice listing):

 svr_NCBI_taxonomy -s `svr_NCBI_taxonomy -l 83333` 83333
 131567 cellular organisms
 2      Bacteria
 1224   Proteobacteria
 1236   Gammaproteobacteria
 91347  Enterobacteriales
 543    Enterobacteriaceae
 561    Escherichia
 562    Escherichia coli
 83333  Escherichia coli K-12
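A similar listing can be assembled without the nested invocation by pairing the -l taxids with the -n names. The Python sketch below uses the values shown in this documentation; the function name lineage_listing is hypothetical. Since the full lineage shown for 83333 ends at its parent (562), the taxon itself is appended explicitly.

```python
def lineage_listing(ids_line, names_line, taxid, name):
    """Pair tab-separated lineage taxids with names, then append the taxon itself."""
    ids = ids_line.split("\t")
    names = names_line.split("\t")
    if len(ids) != len(names):
        raise ValueError("taxid and name lineages differ in length")
    return list(zip(ids, names)) + [(str(taxid), name)]

# Sample data copied from the -l and -n output for taxid 83333.
ids_line = "131567\t2\t1224\t1236\t91347\t543\t561\t562"
names_line = ("cellular organisms\tBacteria\tProteobacteria\tGammaproteobacteria\t"
              "Enterobacteriales\tEnterobacteriaceae\tEscherichia\tEscherichia coli")

for tid, nm in lineage_listing(ids_line, names_line, 83333, "Escherichia coli K-12"):
    print(f"{tid}\t{nm}")
```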

Everything we can get:

 svr_NCBI_taxonomy --All 83333
 Division       Bacteria
 GeneticCode    11
 Lineage        cellular organisms; Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli
 LineageIds     131567  2       1224    1236    91347   543     561     562
 LineageNames   cellular organisms      Bacteria        Proteobacteria  Gammaproteobacteria     Enterobacteriales       Enterobacteriaceae      Escherichia     Escherichia coli
 LineageAbbrev  Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia
 LineageAbbrevIds       2       1224    1236    91347   543     561
 LineageAbbrevNames     Bacteria        Proteobacteria  Gammaproteobacteria     Enterobacteriales       Enterobacteriaceae      Escherichia
 LineageAbbrevPlus      Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli
 LineageAbbrevPlusIds   2       1224    1236    91347   543     561     562
 LineageAbbrevPlusNames Bacteria        Proteobacteria  Gammaproteobacteria     Enterobacteriales       Enterobacteriaceae      Escherichia     Escherichia coli
 Parent 562
 Rank   no rank
 ScientificName Escherichia coli K-12