Documentation read from 04/17/2019 22:07:25 version of /vol/public-pseed/FIGdisk/FIG/bin/svr_blast.

svr_blast

Run blast locally

------ Example:

    svr_blast -p pegs 83333.1      [ blast PEGs identified in file against genome 83333.1 ]
    svr_blast -d pegs 83333.1      [ use blastn, not blastp ]
    svr_blast -s pegs 83333.1      [ blast PEGs in fasta file against genome 83333.1 ]
    svr_blast -p pegs              [ blast PEGs identified in file against themselves ]
    svr_blast 83333.1              [ sequences of PEGs from the last column of STDIN input against genome ]
    svr_blast                      [ sequences of PEGs from the last column of STDIN input against themselves ]
    svr_blast -c 1                 [ sequences of PEGs from the first column of STDIN input against themselves ]
    svr_blast -c 1 -parms='-m8'    [ sequences of PEGs from the first column of STDIN input against themselves, in -m8 format ]

    The output is exactly the unfiltered BLAST output.

------

This svr command may be thought of as implementing two types of requests:

    1.  "Blast a set of PEGs against the genes (or protein products) in a set of genomes"
    2.  "Blast a set of PEGs against itself".

When we say "set of PEGs" or "PEGs in genome", we mean either the DNA or the protein sequences corresponding to the PEGs. Which of the two is used is determined by the -d flag: protein sequences are used by default, and DNA sequences are used if -d is given.

A set of PEGs can be read from a file. If the file contains just IDs, use "-p IDfile". If the file contains actual sequences in FASTA format, use "-s fasta.file".

If you are blasting PEGs against genomes, the genomes are given as one or more arguments of the form xxx.yyy (where xxx.yyy is the genome ID; for example, E. coli is 83333.1).

You can read the PEG IDs from standard input, as with most of the other SVR scripts (this is done only if both -s File and -p File are omitted). The standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line contains the PEG to be blasted. If some other column contains the PEGs, use

    -c N

where N is the column (from 1) that contains the PEG in each case.
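
The column rule above can be sketched as follows. This is only an illustration of the described behavior, not the script's actual code: the function name select_peg_ids, the sample rows, and the handling of blank lines are assumptions made for the example.

```python
def select_peg_ids(lines, column=None):
    """Return one field per tab-separated line.

    column is 1-based, mirroring "-c N"; None means "use the last field".
    (Hypothetical helper; svr_blast itself is not written this way.)
    """
    ids = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # assumption of this sketch: blank lines are skipped
        fields = line.split("\t")
        index = len(fields) - 1 if column is None else column - 1
        ids.append(fields[index])
    return ids


# Illustrative rows only; the IDs and the extra column are made up.
last_col_rows = ["thrL\tfig|83333.1.peg.1", "thrA\tfig|83333.1.peg.2"]
first_col_rows = ["fig|83333.1.peg.1\tthrL", "fig|83333.1.peg.2\tthrA"]

print(select_peg_ids(last_col_rows))             # default: last column
print(select_peg_ids(first_col_rows, column=1))  # equivalent of -c 1
```

Both calls pick out the same two IDs; the only difference is which column holds them.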

NOTE: the PEG sequences are formed as the union of the sequences derived from

    1. the IDs from STDIN (only if -p and -s are omitted)
    2. the IDs from the -p file
    3. the sequences from the -s file
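
A minimal sketch of how the three sources in the NOTE combine, assuming the rule is exactly as stated: STDIN contributes only when neither -p nor -s is given. The function name collect_query_sources and the ("id" / "seq") tagging are hypothetical. The documentation does not say whether duplicates are removed, so this sketch keeps them.

```python
def collect_query_sources(stdin_ids, p_file_ids=None, s_file_seqs=None):
    """Combine the query sources described in the NOTE.

    p_file_ids / s_file_seqs are None when -p / -s was not given.
    Returns (kind, value) pairs, where kind is "id" or "seq".
    (Hypothetical helper for illustration only.)
    """
    sources = []
    # 1. IDs from STDIN count only if both -p and -s are omitted.
    if p_file_ids is None and s_file_seqs is None:
        sources.extend(("id", peg) for peg in stdin_ids)
    # 2. IDs from the -p file.
    if p_file_ids is not None:
        sources.extend(("id", peg) for peg in p_file_ids)
    # 3. Sequences from the -s file.
    if s_file_seqs is not None:
        sources.extend(("seq", seq) for seq in s_file_seqs)
    return sources


# With -p given, STDIN IDs are ignored; -p and -s contributions are merged.
print(collect_query_sources(["pegA"], p_file_ids=["pegB"], s_file_seqs=["MKT"]))
```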

This is a pipe command. The input is taken from the standard input, and the output is to the standard output.

The parameters of the BLAST run are the defaults, unless you use

    -parms='parameters passed to blast'
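
When svr_blast is driven from another program, the argument list can be assembled along these lines. This is a sketch under assumptions: the helper build_svr_blast_argv is hypothetical, it only covers the -p, -s, -c and -parms options described here, and the argument order simply follows the examples at the top of this page. It builds the list without running anything.

```python
def build_svr_blast_argv(genomes=(), id_file=None, fasta_file=None,
                         column=None, parms=None):
    """Build an argument list for svr_blast (hypothetical helper).

    genomes    -- genome IDs of the form xxx.yyy, e.g. "83333.1"
    id_file    -- file of PEG IDs, passed with -p
    fasta_file -- FASTA file of sequences, passed with -s
    column     -- 1-based STDIN column holding the PEGs, passed with -c
    parms      -- string handed through to BLAST via -parms=
    """
    argv = ["svr_blast"]
    if id_file is not None:
        argv += ["-p", id_file]
    if fasta_file is not None:
        argv += ["-s", fasta_file]
    if column is not None:
        argv += ["-c", str(column)]
    if parms is not None:
        # Passed as a single argv element, so no shell quoting is needed.
        argv.append("-parms=" + parms)
    argv += list(genomes)
    return argv


print(build_svr_blast_argv(column=1, parms="-m8"))
print(build_svr_blast_argv(genomes=["83333.1"], id_file="pegs"))
```

The resulting list could be handed to subprocess.run together with the tab-separated table on standard input; keeping the -parms value as one list element avoids the quoting that the shell form -parms='-m8' needs.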

Command-Line Options

-c Column

This is used only if the column containing PEGs is not the last.

Output Format

The output is just the BLAST output.