Documentation read from 04/17/2019 22:07:27 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_oligomer_similarity.

svr_oligomer_similarity [-min=M] [-max=N] < ali.fasta > count-matricies

This command goes through an alignment and computes the pairwise fractions of n-character identities, producing a matrix of values for each string length specified in the -min to -max parameters.

This tool is used to produce estimates of similarity based on the fraction of positions in which n-character perfect matches occur between each pair of sequences.
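The computation described above can be sketched as follows. This is an illustrative sketch only, not the tool's actual implementation. The function names oligomer_fraction and similarity_matrix are hypothetical. It assumes the fraction is taken over all window start positions in the alignment, and that gap characters are compared like any other character; the source specifies neither.

```python
def oligomer_fraction(a, b, n):
    """Fraction of length-n windows that are identical in two aligned sequences.

    Assumption: the denominator is the number of window start positions,
    and gap characters are treated like ordinary characters.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    windows = len(a) - n + 1
    if windows <= 0:
        return 0.0
    matches = sum(1 for i in range(windows) if a[i:i + n] == b[i:i + n])
    return matches / windows


def similarity_matrix(seqs, n):
    """Map each sequence id to its row of pairwise fractions, in input order."""
    ids = list(seqs)
    return {i: [oligomer_fraction(seqs[i], seqs[j], n) for j in ids] for i in ids}


# Toy two-sequence alignment (made-up data), printed in a layout similar to
# the one this page describes: size line, one row per sequence, then '//'.
alignment = {"seq1": "ACGTAC", "seq2": "ACGAAC"}
for n in range(2, 4):
    print(n)
    for seq_id, row in similarity_matrix(alignment, n).items():
        print(seq_id, " ".join(f"{x:.3f}" for x in row))
    print("//")
```

Longer windows are stricter: in the toy alignment a single mismatch breaks two of the five 2-character windows but three of the four 3-character windows.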

The output is a set of matrices written to STDOUT. The format of each matrix is

    N
    Id1 Frac1-1 Frac1-2 Frac1-3...
    Id2 Frac2-1 Frac2-2 Frac2-3...
    . . .
    //
    . . .

Example:

    svr_oligomer_similarity -min=2 -max=4 < seqs.fasta > matricies

would produce a set of matrices, one for each match length from 2 through 4.

Command-Line Options

-min=M

minimum size of character-strings that match (defaults to 2)

-max=N

maximum size of character-strings that match (defaults to 2)

Output Format

The standard output is a file of matrices. Each matrix consists of a line giving the match size, a set of lines (one per input sequence) giving the fractions of positions that have identical matches, and a terminating '//'.
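A reader for this output might look like the sketch below. It is a minimal example under assumptions: fields are taken to be whitespace-separated and blank lines are ignored, neither of which the description above states. The function name parse_matrices and the sample data are hypothetical.

```python
def parse_matrices(text):
    """Parse matrix output into {match_size: {sequence_id: [fractions]}}.

    Assumption: fields are whitespace-separated and blank lines carry no meaning.
    """
    matrices = {}
    size = None
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "//":
            # End of the current matrix: store it and reset state.
            if size is not None:
                matrices[size] = rows
            size, rows = None, {}
        elif size is None:
            # First line of a matrix is the match size.
            size = int(line)
        else:
            fields = line.split()
            rows[fields[0]] = [float(x) for x in fields[1:]]
    return matrices


# Made-up sample in the documented layout.
sample = """2
seqA 1.0 0.5
seqB 0.5 1.0
//
3
seqA 1.0 0.25
seqB 0.25 1.0
//
"""
parsed = parse_matrices(sample)
print(sorted(parsed))
print(parsed[3]["seqA"])
```

Keying the result by match size makes it easy to compare the same pair of sequences across the -min to -max range.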