Documentation read from 01/30/2020 17:36:20 version of /vol/public-pseed/FIGdisk/dist/releases/cvs.1555556707/common/lib/FigKernelPackages/ServerThing.pm.

General Server Helper

This package provides a method, RunServer, that can be called from a CGI script to perform the duties of a FIG server. RunServer is called with two parameters: the name of the server package (e.g. SAP for SAP.pm) and the first command-line parameter. The command-line parameter, if defined, is used as the tracing key and also indicates that the script is being invoked from the command line rather than over the web.

RunRabbitMQClient

This routine sets itself up as an FCGI listener for incoming FCGI requests (like RunServer), but instead of processing the requests, it forwards them to the RabbitMQ message broker. For each request, we set up an ephemeral response queue for handling the response to the message.

Note that we don't touch the message bodies; they are only decoded on the node that actually processes the message.

RunRabbitMQClientAsync($server_name, $config)

Run the asynchronous FCGI gateway server.

RunRabbitMQServer

This is the agent code that listens on a queue for incoming requests to process data. We run one of these processes for every core we want to do active processing.

Server Utility Methods

The methods in this section are utilities of general use to the various server modules.

AddSubsystemFilter

    ServerThing::AddSubsystemFilter(\$filter, $args, $roles);

Add subsystem filtering information to the specified query filter clause based on data in the argument hash. The argument hash will be checked for the -usable parameter, which includes or excludes unusable subsystems, the -exclude parameter, which lists types of subsystems that should be excluded, and the -aux parameter, which filters on auxiliary roles.

filter

Reference to the current filter string. If additional filtering is required, this string will be updated.

args

Reference to the parameter hash for the current server call. This hash will be examined for the -usable, -exclude, and -aux parameters.

roles

If TRUE, role filtering will be applied. In this case, the default action is to exclude auxiliary roles unless -aux is TRUE.

GetIdList

    my $ids = ServerThing::GetIdList($name => $args, $optional);

Get a named list of IDs from an argument structure. If the IDs are missing, or are not a list, an error will occur.

name

Name of the argument structure member that should contain the ID list.

args

Argument structure from which the ID list is to be extracted.

optional (optional)

If TRUE, then a missing value will not generate an error. Instead, an empty list will be returned. The default is FALSE.

RETURN

Returns a reference to a list of IDs taken from the argument structure.
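
The rule described above can be sketched in a few lines of standalone Perl. This is only an illustration of the documented behavior, not the module's own code: the name get_id_list, the error wording, and the sample IDs are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for GetIdList, following only the behavior documented here.
sub get_id_list {
    my ($name, $args, $optional) = @_;
    my $ids = $args->{$name};
    if (! defined $ids) {
        # A missing value is an error unless the caller marked it optional.
        die "No '$name' parameter present.\n" unless $optional;
        return [];
    }
    # Anything other than a list reference is rejected.
    die "'$name' parameter must be a list.\n" unless ref $ids eq 'ARRAY';
    return $ids;
}

# Made-up argument structure for illustration.
my $args = { -ids => ['fig|100226.1.peg.1', 'fig|100226.1.peg.2'] };
my $ids  = get_id_list(-ids => $args);
print scalar(@$ids), " IDs found\n";
```

Passing a TRUE third argument, as in get_id_list(-genomes => $args, 1), would return an empty list here instead of dying.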

RunTool

    ServerThing::RunTool($name => $cmd);

Run a command-line tool. A non-zero return value from the tool will cause a fatal error, and the tool's error log will be traced.

name

Name to give to the tool in the error output.

cmd

Command to use for running the tool. This should be the complete command line. The command should not contain any fancy piping, though it may redirect the standard input and output. The command will be modified by this method to redirect the error output to a temporary file.
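
One way to picture the redirect-and-check behavior is the sketch below. It is a simplified assumption of how such a helper could work, not the module's implementation: run_tool is a hypothetical name, and instead of tracing the error log it simply includes the log text in the fatal error.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for RunTool: run a command with stderr captured in a temp file.
sub run_tool {
    my ($name, $cmd) = @_;
    # Create the temporary error log and close our handle; the shell will write to it.
    my ($fh, $errorLog) = tempfile("errlogXXXXXX", TMPDIR => 1, UNLINK => 1);
    close $fh;
    # Append the stderr redirect to the caller's command line.
    my $rc = system("$cmd 2> $errorLog");
    if ($rc != 0) {
        # Read the log back so it can be reported with the fatal error.
        open(my $ih, '<', $errorLog) or die "Cannot read error log: $!";
        my @lines = <$ih>;
        close $ih;
        die "$name failed (status $rc):\n" . join('', @lines);
    }
    return;
}

# $^X is the path of the running perl, used here as a harmless test tool.
run_tool(Demo => qq($^X -e "exit 0"));
print "tool succeeded\n";
```

Because the redirect is appended to the command string, a command that already redirects its own error output would conflict with it, which matches the restriction on the cmd parameter above.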

ReadCountVector

    my $vector = ServerThing::ReadCountVector($qh, $field, $rawFlag);

Extract a count vector from a query. The query can contain zero or more results, and the vectors in the specified result field of the query must be concatenated together in order. This method is optimized for the case (expected to be most common) where there is only one result.

qh

Handle for the query from which results are to be extracted.

field

Name of the field containing the count vectors.

rawFlag

TRUE if the vector is to be returned as a raw string, FALSE if it is to be returned as a reference to a list of numbers.

RETURN

Returns the desired vector, either encoded as a string or as a reference to a list of numbers.

ChangeDB

    ServerThing::ChangeDB($thing, $newDbName);

Change the Sapling database used by this server. The old database will be closed and a new one attached.

newDbName

Name of the new Sapling database on which this server should operate. If omitted, the default database will be used.

Gene Correspondence File Methods

These methods relate to gene correspondence files, which are generated by the svr_corresponding_genes.pl script. Correspondence files are cached in the organism cache ($FIG_Config::orgCache) directory. Eventually they will be copied into the organism directories themselves. At that point, the code below will be modified to check the organism directories first and use the cache directory if no file is found there.

A gene correspondence file contains correspondences from a source genome to a target genome. Most such correspondences are bidirectional best hits. A unidirectional best hit may exist from the source genome to the target genome or in the reverse direction from the target genome to the source genome. The cache directory itself is divided into subdirectories by organism. The subdirectory has the source genome name and the files themselves are named by the target genome.

Some of the files are invalid and will be erased when they are found. A file is considered invalid if it has a non-numeric value in a numeric column or if it does not have any unidirectional hits from the target genome to the source genome.

The process of managing the correspondence files is tricky and dangerous because of the possibility of race conditions. It can take several minutes to generate a file, and if two processes try to generate the same file at the same time we need to make sure they don't step on each other.

In stored files, the source genome ID is always lexically lower than the target genome ID. If a correspondence in the reverse direction is desired, the converse file is found and the contents flipped automatically as they are read. So, the correspondence from 360108.3 to 100226.1 would be found in a file with the name 360108.3 in the directory for 100226.1. Since this file actually has 100226.1 as the source and 360108.3 as the target, the columns are re-ordered and the arrows reversed before the file contents are passed to the caller.
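
The naming and flipping rule above can be sketched as follows. The function name correspondence_location and the cache root /tmp/orgCache are hypothetical; the real root comes from $FIG_Config::orgCache, and the sketch assumes that "lexically lower" means an ordinary string comparison.

```perl
use strict;
use warnings;

# Hypothetical cache root; the real location comes from $FIG_Config::orgCache.
my $orgCache = '/tmp/orgCache';

# Return (file name, stored source, stored target, converse flag) for a requested pairing.
sub correspondence_location {
    my ($genome1, $genome2) = @_;
    # Stored files always have the lexically lower ID as the source.
    my $converse = ($genome1 gt $genome2) ? 1 : 0;
    my ($source, $target) = $converse ? ($genome2, $genome1) : ($genome1, $genome2);
    # The directory is named for the stored source, the file for the stored target.
    return ("$orgCache/$source/$target", $source, $target, $converse);
}

my ($file, $source, $target, $converse) = correspondence_location('360108.3', '100226.1');
print "$file (converse=$converse)\n";    # prints "/tmp/orgCache/100226.1/360108.3 (converse=1)"
```

In this sketch a converse flag of 1 means the stored file runs in the opposite direction from the request, so its rows would have to be flipped as they are read.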

Gene Correspondence List

A gene correspondence file contains 18 columns, numbered 0 through 17 below, optionally followed by a 19th (column 18). These are usually packaged as a reference to a list of lists. Each sub-list has the following format.

0

The ID of a PEG in genome 1.

1

The ID of a PEG in genome 2 that is our best estimate of a "corresponding gene".

2

Count of the pairs of matching genes that were found in the context.

3

Pairs of corresponding genes from the contexts.

4

The function of the gene in genome 1.

5

The function of the gene in genome 2.

6

Comma-separated list of aliases for the gene in genome 1 (any protein with an identical sequence is considered an alias, whether or not it is actually the name of the same gene in the same genome).

7

Comma-separated list of aliases for the gene in genome 2 (any protein with an identical sequence is considered an alias, whether or not it is actually the name of the same gene in the same genome).

8

Bi-directional best hits will contain "<=>" in this column; otherwise, "->" will appear.

9

Percent identity over the region of the detected match.

10

The P-score for the detected match.

11

Beginning match coordinate in the protein encoded by the gene in genome 1.

12

Ending match coordinate in the protein encoded by the gene in genome 1.

13

Length of the protein encoded by the gene in genome 1.

14

Beginning match coordinate in the protein encoded by the gene in genome 2.

15

Ending match coordinate in the protein encoded by the gene in genome 2.

16

Length of the protein encoded by the gene in genome 2.

17

Bit score for the match. Divide by the length of the longer PEG to get what we often refer to as a "normalized bit score".

18 (optional)

Clear-correspondence indicator. If present, will be 1 if the correspondence is a clear bidirectional best hit (no similar candidates) and 0 otherwise.

In the actual files, there will also be reverse correspondences indicated by a back-arrow ("<-") in item (8). The output returned by the servers, however, is filtered so that only forward correspondences occur. If a converse file is used, the columns are re-ordered and the arrows reversed so that it looks correct.
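
As an illustration of the column layout, the sketch below splits one row and derives the "normalized bit score" mentioned for column 17. The sample row is made up, the tab delimiter is an assumption (the storage format is not specified here), and the value in column 3 is an opaque placeholder.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Made-up sample row; tab-delimited storage is an assumption of this sketch.
my $line = join("\t",
    'fig|100226.1.peg.1', 'fig|360108.3.peg.7', 3, 'context-pairs',
    'Function A', 'Function B', 'alias1', 'alias2', '<=>',
    87.5, '1e-50', 1, 200, 210, 5, 204, 220, 400);

my @row = split /\t/, $line;
# Column 17 is the bit score; columns 13 and 16 are the two protein lengths.
my $normalized = $row[17] / max($row[13], $row[16]);
# Column 8 holds the arrow; "<=>" marks a bidirectional best hit.
my $bidirectional = ($row[8] eq '<=>') ? 1 : 0;
printf "%s -> %s: normalized bit score %.3f, bidirectional=%d\n",
    $row[0], $row[1], $normalized, $bidirectional;
```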

CheckForGeneCorrespondenceFile

    my ($fileName, $converse) = ServerThing::CheckForGeneCorrespondenceFile($genome1, $genome2);

Try to find a gene correspondence file for the specified genome pairing. If the file exists, its name and an indication of whether or not it is in the correct direction will be returned.

genome1

Source genome for the desired correspondence.

genome2

Target genome for the desired correspondence.

RETURN

Returns a two-element list. The first element is the name of the file containing the correspondence, or undef if the file does not exist. The second element is TRUE if the correspondence would be forward or FALSE if the file needs to be flipped.

ComputeCorrespondenceFileName

    my ($fileName, $genomeA, $genomeB) = ServerThing::ComputeCorrespondenceFileName($genome1, $genome2);

Compute the name to be given to a genome correspondence file in the organism cache and return the source and target genomes that would be in it.

genome1

Source genome for the desired correspondence.

genome2

Target genome for the desired correspondence.

RETURN

Returns a three-element list. The first element is the name of the file to contain the correspondence, the second element is the name of the genome that would act as the source genome in the file, and the third element is the name of the genome that would act as the target genome in the file.

ComputeCorrespondenceDirectory

    my $dirName = ServerThing::ComputeCorrespondenceDirectory($genome);

Return the name of the directory that would contain the correspondence files for the specified genome.

genome

ID of the genome whose correspondence file directory is desired.

RETURN

Returns the name of the directory of interest.

CreateGeneCorrespondenceFile

    my ($fileName, $converse) = ServerThing::CreateGeneCorrespondenceFile($genome1, $genome2);

Create a new gene correspondence file in the organism cache for the specified genome correspondence. The name of the new file will be returned along with an indicator of whether or not it is in the correct direction.

genome1

Source genome for the desired correspondence.

genome2

Target genome for the desired correspondence.

RETURN

Returns a two-element list. The first element is the name of the file containing the correspondence, or undef if an error occurred. The second element is TRUE if the correspondence would be forward or FALSE if the file needs to be flipped.

MustFlipGenomeIDs

    my $converse = ServerThing::MustFlipGenomeIDs($genome1, $genome2);

Return TRUE if the specified genome IDs are out of order. When genome IDs are out of order, they are stored in the converse order in correspondence files on the server. This is a simple method that allows the caller to check for the need to flip.

genome1

ID of the proposed source genome.

genome2

ID of the proposed target genome.

RETURN

Returns TRUE if the first genome would be stored on the server as a target, FALSE if it would be stored as a source.

ReadGeneCorrespondenceFile

    my $list = ServerThing::ReadGeneCorrespondenceFile($fileName, $converse, $all);

Return the contents of the specified gene correspondence file in the form of a list of lists, with backward correspondences filtered out. If the file is for the converse of the desired correspondence, the columns will be reordered automatically so that it looks as if the file were designed for the proper direction.

fileName

The name of the gene correspondence file to read.

converse (optional)

TRUE if the file is for the converse of the desired correspondence, else FALSE. If TRUE, the file columns will be reordered automatically. The default is FALSE, meaning we want to use the file as it appears on disk.

all (optional)

TRUE if backward unidirectional correspondences should be included in the output. The default is FALSE, in which case only forward and bidirectional correspondences are included.

RETURN

Returns a "Gene Correspondence List" in the form of a reference to a list of lists. If the file's contents are invalid or an error occurs, an undefined value will be returned.

ReverseGeneCorrespondenceRow

    ServerThing::ReverseGeneCorrespondenceRow($row);

Convert a gene correspondence row to represent the converse correspondence. The elements in the row will be reordered to represent a correspondence from the target genome to the source genome.

row

Reference to a list containing a single row from a "Gene Correspondence List".
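
A rough sketch of such a reversal, based only on the column descriptions in the "Gene Correspondence List" section, is shown below. The name reverse_row is hypothetical, the sample row is made up, and because the internal format of column 3 is not described here, the sketch leaves that column untouched; a full reversal might need to flip it as well.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ReverseGeneCorrespondenceRow; modifies the row in place.
sub reverse_row {
    my ($row) = @_;
    # Swap the paired genome-1 / genome-2 columns: IDs, functions, aliases.
    @$row[0, 1, 4, 5, 6, 7] = @$row[1, 0, 5, 4, 7, 6];
    # Swap the (begin, end, length) triples for the two proteins.
    @$row[11 .. 16] = @$row[14 .. 16, 11 .. 13];
    # Reverse a unidirectional arrow; "<=>" reads the same both ways.
    my %flip = ('->' => '<-', '<-' => '->', '<=>' => '<=>');
    $row->[8] = $flip{$row->[8]} // $row->[8];
    return;
}

# Made-up sample row with a forward unidirectional hit.
my @row = ('fig|100226.1.peg.1', 'fig|360108.3.peg.7', 3, 'context-pairs',
           'Function A', 'Function B', 'alias1', 'alias2', '->',
           87.5, '1e-50', 1, 200, 210, 5, 204, 220, 400);
reverse_row(\@row);
print join("\t", @row[0, 1, 8]), "\n";
```

Columns 2, 9, 10, and 17 describe the match as a whole, so they stay where they are.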

ValidateGeneCorrespondenceRow

    my $errorCount = ServerThing::ValidateGeneCorrespondenceRow($row);

Validate a gene correspondence row. The numeric fields are checked to ensure they are numeric and the source and target gene IDs are validated. The return value will indicate the number of errors found.

row

Reference to a list containing a single row from a "Gene Correspondence List".

RETURN

Returns the number of errors found in the row. A return of 0 indicates the row is valid.
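
The kind of check described here might look like the sketch below. It is not the module's code: validate_row is a hypothetical name, the choice of numeric columns is inferred from the column descriptions in the "Gene Correspondence List" section, and the fig|genome.peg.N pattern for gene IDs is an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for ValidateGeneCorrespondenceRow.
sub validate_row {
    my ($row) = @_;
    my $errors = 0;
    # Columns 0 and 1 are gene IDs; the fig|genome.peg.N pattern is an assumption.
    for my $i (0, 1) {
        $errors++ unless defined $row->[$i] && $row->[$i] =~ /^fig\|\d+\.\d+\.peg\.\d+$/;
    }
    # Columns 2 and 9 through 17 are described as counts, scores, coordinates, or lengths.
    for my $i (2, 9 .. 17) {
        $errors++ unless defined $row->[$i] && looks_like_number($row->[$i]);
    }
    return $errors;
}

# Made-up sample row that passes every check in this sketch.
my @good = ('fig|100226.1.peg.1', 'fig|360108.3.peg.7', 3, 'context-pairs',
            'Function A', 'Function B', 'alias1', 'alias2', '<=>',
            87.5, '1e-50', 1, 200, 210, 5, 204, 220, 400);
print "errors: ", validate_row(\@good), "\n";
```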

GetCorrespondenceData

    my $corrList = ServerThing::GetCorrespondenceData($genome1, $genome2, $passive, $full);

Return the "Gene Correspondence List" for the specified source and target genomes. If the list is in a file, it will be read. If the file does not exist, it may be created.

genome1

ID of the source genome.

genome2

ID of the target genome.

passive

If TRUE, then the correspondence file will not be created if it does not exist.

full

If TRUE, then both directions of the correspondence will be represented; otherwise, only correspondences from the source to the target (including bidirectional correspondences) will be included.

RETURN

Returns a "Gene Correspondence List" in the form of a reference to a list of lists, or an undefined value if an error occurs or no file exists and passive mode was specified.

Internal Utility Methods

The methods in this section are used internally by this package.

RunRequest

    ServerThing::RunRequest($cgi, $serverThing, $docURL);

Run a request from the specified server using the incoming CGI parameter object for the parameters.

cgi

CGI query object containing the parameters from the web service request. The significant parameters are as follows.

function

Name of the function to run.

args

Parameters for the function.

encoding

Encoding scheme for the function parameters, either yaml (the default) or json (used by the Java interface).

Certain unusual requests can come in outside of the standard function interface. These are indicated by special parameters that override all the others.

pod

Display a POD documentation module.

code

Display an example code file.

file

Transfer a file (not implemented).

serverThing

Server object against which to run the request.

docURL

URL to use for POD documentation requests.
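
The parameter handling described above can be pictured with the sketch below. It is an assumption-laden illustration rather than the real dispatcher: classify_request and the function name get_ids are hypothetical, a plain hash stands in for the CGI object, the order in which the special parameters are checked is a guess, and only the json encoding is decoded because YAML support needs a non-core module.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical classifier: decide how a request would be routed from its parameters.
sub classify_request {
    my ($params) = @_;
    # Special parameters override the standard function interface.
    for my $special (qw(pod code file)) {
        return ($special, $params->{$special}) if defined $params->{$special};
    }
    # yaml is the documented default encoding.
    my $encoding = $params->{encoding} || 'yaml';
    my $args = $params->{args};
    # Only JSON is decoded here; YAML decoding would need a non-core module.
    $args = decode_json($args) if $encoding eq 'json' && defined $args;
    return ('function', $params->{function}, $args);
}

my ($kind, $name, $args) = classify_request({
    function => 'get_ids',                 # made-up function name
    args     => '{"-ids":["a","b"]}',
    encoding => 'json',
});
print "$kind $name with ", scalar(@{ $args->{-ids} }), " ids\n";
```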

CreateFile

    ServerThing::CreateFile();

Create a new, empty temporary file and send its name back to the client.

OpenFile

    ServerThing::OpenFile($name);

Send the length of the named file back to the client.

name

Name of the file whose length is to be sent.

ReadChunk

    ServerThing::ReadChunk($name, $location, $size);

Read the indicated number of bytes from the specified location of the named file and send them back to the client.

name

Name of the file to read.

location

Location in the file at which reading should begin.

size

Number of bytes to read.
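
The read step can be sketched with Perl's built-in seek and read. The name read_chunk is hypothetical, the location is assumed to be a byte offset from the start of the file, and the sketch returns the bytes to the caller rather than sending them to a client.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the read step of ReadChunk; returns the bytes instead of sending them.
sub read_chunk {
    my ($name, $location, $size) = @_;
    open(my $fh, '<:raw', $name) or die "Cannot open $name: $!";
    # Whence 0 means the offset is measured from the start of the file.
    seek($fh, $location, 0) or die "Cannot seek in $name: $!";
    my $data = '';
    # read() returns the number of bytes read, 0 at end of file, or undef on error.
    defined(read($fh, $data, $size)) or die "Cannot read $name: $!";
    close $fh;
    return $data;
}

# Build a small temporary file to read from.
my ($out, $fileName) = tempfile(UNLINK => 1);
print $out "0123456789";
close $out;
print read_chunk($fileName, 3, 4), "\n";    # prints "3456"
```

A request that runs past the end of the file simply returns the bytes that are available.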

WriteChunk

    ServerThing::WriteChunk($name, $data);

Write the specified data to the named file.

name

Name of the file to write.

data

Data to write to the file.

LineNumberize

    ServerThing::LineNumberize($module);

Output the module line by line with line numbers.

module

Name of the module to be line-numberized.

ProducePod

    ServerThing::ProducePod($module);

Output the POD documentation for the specified module.

module

Name of the module whose POD document is to be displayed.

TraceErrorLog

    ServerThing::TraceErrorLog($name, $errorLog);

Trace the specified error log file. This is a very dinky routine that performs a task required by "RunTool" in multiple places.

name

Name of the tool relevant to the log file.

errorLog

Name of the log file.

SendError

    ServerThing::SendError($message, $status);

Fail an HTTP request with the specified error message and the specified status message.

message

Detailed error message. This is sent as the page content.

status

Status message. This is sent as part of the status code.

Log

    Log($msg);

Write a message to the log. This is a temporary hack until we can figure out how to get normal tracing and error logging working.

msg

Message to write. It will be appended to the servers.log file in the FIG temporary directory.

ServerReturn

ServerReturn is a little class used to encapsulate responses to be sent back to clients. It holds an error code (to be pushed into an HTTP response), a short message, and long details.