Documentation read from 04/17/2019 22:07:26 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_find_clusters_relevant_to_reaction.

svr_find_clusters_relevant_to_reaction

Find clusters potentially relevant to a search for a "missing gene"

------

Example:

    svr_find_clusters_relevant_to_reaction -g 83333.1 < unconnected.reactions

would take as input a file containing reactions that are believed to be present in the genome 83333.1 but cannot yet be connected to specific genes.

------

The standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line contains a reaction for which relevant clusters are desired. If some other column contains the reactions, use

    -c N

where N is the number of the column (counting from 1) that contains the reaction.

This is a pipe command. The input is taken from the standard input, and the output is to the standard output.
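
The column convention described above can be illustrated with a short Python sketch. It only shows how one input line might be interpreted and is not the script's own code; the function name `reaction_field` and the sample reaction IDs are hypothetical.

```python
def reaction_field(line, column=None):
    """Return the reaction field of one tab-separated input line.

    column is a 1-based column number, as with -c N.
    None means "use the last field", the documented default.
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    return fields[column - 1]


# Hypothetical input lines; the reaction IDs are placeholders.
print(reaction_field("note A\trxn00001\n"))            # takes the last field
print(reaction_field("rxn00002\tnote B\n", column=1))  # takes the first field
```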

Command-Line Options

-c Column

This is used only if the column containing reactions is not the last.

-g Genome

This normally specifies the genome for which relevant clusters are sought. If the value is an integer, each line of input is instead treated as a genome-reaction pair, and the integer gives the column of each input line that contains the genome. In that case, the -c parameter gives the column that contains the reaction.
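
The two readings of the -g value can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's implementation: the function names are hypothetical, and the genome column is assumed to be counted from 1, as documented for -c.

```python
def interpret_genome_option(value):
    """Classify a -g argument as a genome ID or a column number.

    A genome ID such as 83333.1 contains a dot, so it is never
    all digits; a plain integer is read as a column number.
    """
    if value.isdigit():
        return ("column", int(value))
    return ("genome", value)


def genome_for_line(line, g_value):
    """Return the genome that applies to one tab-separated input line."""
    kind, parsed = interpret_genome_option(g_value)
    if kind == "genome":
        return parsed  # the same genome applies to every line
    # Assumption: the genome column is 1-based, like -c.
    return line.rstrip("\n").split("\t")[parsed - 1]


print(interpret_genome_option("83333.1"))
print(genome_for_line("rxn00001\t83333.1\n", "2"))
```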

-d MaxSteps [default is 1]

This parameter gives the maximum number of steps (i.e., the "radius") that the program may take when building the neighborhood of a reaction.
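
The documentation does not say how a step is defined. As one possible reading, the Python sketch below treats reactions as nodes of a graph and collects everything within a given number of steps by breadth-first search. The graph, the reaction names, and the function `neighborhood` are all hypothetical and are not taken from the script.

```python
from collections import deque


def neighborhood(graph, start, max_steps=1):
    """Return all nodes reachable from start in at most max_steps steps."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # already at the radius limit; do not expand further
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


# Hypothetical adjacency list: R1 - R2 - R3.
graph = {"R1": ["R2"], "R2": ["R1", "R3"], "R3": ["R2"]}
print(sorted(neighborhood(graph, "R1", max_steps=1)))
print(sorted(neighborhood(graph, "R1", max_steps=2)))
```

With the default radius of 1, only directly linked reactions join the neighborhood; a larger value widens the search.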

-i inputFile

If specified, the name of the input file; otherwise, the input will be taken from STDIN.

Output Format

The standard output is a tab-delimited file. It consists of the input file with an extra column added. The extra column will contain a list of two or more comma-separated PEGs. There may be more than one output line for a single input (if multiple clusters are detected within the genome).
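
A downstream script might read this output as sketched below in Python. The sketch assumes only what is stated above: the last column holds comma-separated PEGs, and one input line may yield several output lines. The function name `group_clusters` and the sample lines are made up for illustration.

```python
from collections import defaultdict


def group_clusters(output_lines):
    """Map each original input line (as a tuple of fields) to its clusters.

    Each cluster is the list of PEG IDs found in the added last column.
    """
    clusters = defaultdict(list)
    for line in output_lines:
        *input_fields, peg_field = line.rstrip("\n").split("\t")
        clusters[tuple(input_fields)].append(peg_field.split(","))
    return dict(clusters)


# Made-up output: two clusters reported for the same input line.
sample = [
    "rxn00001\tfig|83333.1.peg.10,fig|83333.1.peg.11\n",
    "rxn00001\tfig|83333.1.peg.200,fig|83333.1.peg.201,fig|83333.1.peg.203\n",
]
for key, peg_lists in group_clusters(sample).items():
    print(key, len(peg_lists), "cluster(s)")
```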