Documentation read from 04/17/2019 22:07:25 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_assign_to_dna_using_figfams.
svr_assign_to_dna_using_figfams <feature_list.fasta >functions.tbl
Assign Using the FIGfams Server
This script reads a FASTA file of DNA contigs from the standard input and writes the predicted functions found in each contig (where they can be estimated) to the standard output. FIGfams are used to determine the function when possible. When not possible, a message is written to the standard error output.
This script differs substantially from svr_assign_using_figfams.pl: each incoming sequence is treated as a contig in which hits can be found, rather than as a single sequence whose function is desired. As a result, output records do not correspond one-to-one with input sequences. Some sequences will get many hits, some will get only one, and some may get none.
Size of the kmers used to detect similarity. Defaults to 8.
A number, generally from 1 to 100, indicating how careful we should be about making the assignments. A higher number indicates greater care. Defaults to 5.
When looking for a match, if two sequence elements match and are closer than this distance, then they will be considered part of a single match. Otherwise, the match will be split.
When looking for a match, we group together a set of hits that "covers" some region. The set is not necessarily in a single frame (i.e., we treat the sequence as low quality and only consider the number of hits in a region on the same strand). This parameter forces the size of the region to be above a specified value. The default is 6 * the size of the kmers (the 'kmer' parameter), which is 48 when the kmer size is left at its default of 8.
Display the hits ordered by location.
Display this command's parameters and options.
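The gap and minimum-size parameters described above can be pictured with a small sketch. This is not the script's actual implementation, only an illustration of the stated rules under several assumptions: hits are given as kmer start positions on one strand, a region's size is measured from the first hit to the end of the last kmer, and the names group_hits, max_gap, and min_size are hypothetical. The value 50 used for max_gap is made up for the example, since the documentation gives no default for it.

```python
from typing import List, Tuple


def group_hits(positions: List[int], max_gap: int, kmer: int = 8,
               min_size: int = None) -> List[Tuple[int, int, int]]:
    """Group kmer hit positions into regions (illustrative only).

    Returns (start, end, hit_count) for each region that is kept.
    """
    if min_size is None:
        min_size = 6 * kmer  # documented default: 6 * the kmer size

    groups: List[List[int]] = []
    for pos in sorted(positions):
        # Hits closer than max_gap join the current group; otherwise split.
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])

    regions = []
    for group in groups:
        start = group[0]
        end = group[-1] + kmer  # assumed: region ends where the last kmer ends
        if end - start > min_size:  # "above" the minimum, read as strictly greater
            regions.append((start, end, len(group)))
    return regions


# Seven closely spaced hits form one region; the isolated hit at 200 is dropped.
example = group_hits([0, 8, 16, 24, 32, 40, 48, 200], max_gap=50)
print(example)
```

With the default kmer size of 8, the minimum region size works out to 48, so a lone hit (size 8) never forms a region on its own in this sketch.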
The standard output is a tab-delimited file. Each output record consists of the ID of the query sequence, the number of matching kmers, a location in the sequence, the predicted function, and an organism name that represents an OTU category.
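Because the output is tab-delimited with five fields per record, it is straightforward to load into a downstream script. The following is a minimal sketch that assumes the fields appear in the order listed above and that there is no header line. The names Assignment and read_assignments are hypothetical, the location is kept as an uninterpreted string because its exact format is not specified here, and the sample record is invented purely for illustration.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class Assignment(NamedTuple):
    query_id: str    # ID of the query sequence (contig)
    kmer_hits: int   # number of matching kmers
    location: str    # location in the sequence, kept as raw text
    function: str    # predicted function
    otu: str         # organism name representing an OTU category


def read_assignments(lines: Iterable[str]) -> Iterator[Assignment]:
    """Yield one Assignment per non-empty tab-delimited line."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"line {number}: expected 5 fields, got {len(fields)}")
        query_id, hits, location, function, otu = fields
        yield Assignment(query_id, int(hits), location, function, otu)


# Invented sample record; real output will differ.
sample = io.StringIO("contig_1\t12\tsome_location\tExample function\tExample organism\n")
for record in read_assignments(sample):
    print(record.query_id, record.kmer_hits, record.function)
```

In practice the same function could be given an open file handle for functions.tbl, the output file named in the usage line.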