AlignmentTree Entity
An alignment arranges a group of protein sequences so that they match. Each alignment is associated with a phylogenetic tree that describes how the sequences developed and their evolutionary distance. The actual tree and alignment FASTA are stored in separate flat files.
Relationships
AlignmentTree Fields
Name | Type | Notes |
id | string | Unique identifier for this AlignmentTree. |
alignment-method | string | The name of the program used to produce the alignment. |
alignment-parameters | text | The parameters given to the program when producing the alignment. |
alignment-properties | text | A colon-delimited string of key-value pairs containing additional properties of the alignment. |
tree-method | string | The name of the program used to produce the tree. |
tree-parameters | text | The parameters given to the program when producing the tree. |
tree-properties | text | A colon-delimited string of key-value pairs containing additional properties of the tree. |
AlignmentTree Indexes
Table | Name | Type | Fields | Notes |
AlignmentTree | idx0 | | id | Primary index for AlignmentTree. |
Annotation Entity
An annotation is a comment attached to a feature. Annotations are used to track the history of a feature's functional assignments and any related issues. The key is the feature ID followed by a colon and a complemented ten-digit sequence number.
The complemented sequence number causes the annotations to sort with the most recent one first.
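The sort behavior described above can be sketched as follows. This is a minimal illustration, not the database's own code: it assumes that "complemented" means subtracting the sequence number from 9999999999 and that larger sequence numbers are more recent. The function name and the sample feature ID are hypothetical.

```python
# Assumption: the complement is taken against the largest ten-digit number.
MAX_SEQ = 10**10 - 1  # 9999999999


def annotation_key(feature_id: str, sequence_number: int) -> str:
    """Build an annotation key: feature ID, a colon, complemented sequence number."""
    if not 0 <= sequence_number <= MAX_SEQ:
        raise ValueError("sequence number must fit in ten digits")
    complemented = MAX_SEQ - sequence_number
    return f"{feature_id}:{complemented:010d}"


# Hypothetical feature ID, used only for illustration.
feature = "fig|83333.1.peg.4"
keys = [annotation_key(feature, n) for n in (1, 2, 3)]

# An ordinary ascending string sort now puts the highest sequence number first.
newest_first = sorted(keys)
```

Because the complemented number is zero-padded to a fixed width, plain lexical ordering of the keys is enough and no separate sort column is needed.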
Relationships
Annotation Fields
Name | Type | Notes |
id | string | Unique identifier for this Annotation. |
annotation-time | date | Date and time at which the annotation was made. |
annotator | string | Name of the annotator who made the comment. |
comment | text | Text of the annotation. |
Annotation Indexes
Table | Name | Type | Fields | Notes |
Annotation | idx0 | | id | Primary index for Annotation. |
AtomicRegulon Entity
An atomic regulon is an indivisible group of coregulated features on a single genome. Atomic regulons are constructed so that a given feature can only belong to one. Because of this, the expression levels for atomic regulons represent in some sense the state of a cell.
The ID of an atomic regulon is the genome ID, a colon, and the atomic regulon sequence number.
Relationships
- AtomicRegulon IsFormedOf Feature (one-to-many)
- AtomicRegulon IsAffectedIn Experiment (many-to-many)
- AtomicRegulon WasGeneratedFrom Chip (many-to-many)
- AtomicRegulon ReflectsStateOf Genome (many-to-one)
AtomicRegulon Fields
Name | Type | Notes |
id | string | Unique identifier for this AtomicRegulon. |
AtomicRegulon Indexes
Table | Name | Type | Fields | Notes |
AtomicRegulon | idx0 | | id | Primary index for AtomicRegulon. |
Attribute Entity
An attribute describes a category of condition or characteristic for an experiment. The goals of the experiment can be inferred from its values for all the attributes of interest.
Relationships
Attribute Fields
Name | Type | Notes |
id | string | Unique identifier for this Attribute. |
description | text | Descriptive text indicating the nature and use of this attribute. |
Attribute Indexes
Table | Name | Type | Fields | Notes |
Attribute | idx0 | | id | Primary index for Attribute. |
Cluster Entity
A cluster is a set of features that occur close to each other on a genome and are related by a significant degree of functional coupling.
Relationships
Cluster Fields
Name | Type | Notes |
id | int | Unique identifier for this Cluster. |
Cluster Indexes
Table | Name | Type | Fields | Notes |
Cluster | idx0 | | id | Primary index for Cluster. |
Complex Entity
A complex is a set of chemical reactions that act in concert to effect a role. The complex ID is a sequence of letters and numbers.
Relationships
- Complex IsSetOf Reaction (many-to-many)
- Complex IsTriggeredBy Role (many-to-many)
Complex Fields
Name | Type | Notes |
id | string | Unique identifier for this Complex. |
name | string array | Name of this complex. Not all complexes have names. |
Complex Indexes
Table | Name | Type | Fields | Notes |
Complex | idx0 | | id | Primary index for Complex. |
Compound Entity
A compound is a chemical that participates in a reaction. All compounds have a unique ID and may also have one or more names. Both ligands and reaction components are treated as compounds.
Relationships
- Compound IsTerminusFor Scenario (many-to-many)
- Compound IsAttractedTo Structure (many-to-many)
- Compound IsInvolvedIn Reaction (many-to-many)
- Compound IsShownOn Diagram (many-to-many)
Compound Fields
Name | Type | Notes |
id | string | Unique identifier for this Compound. |
label | string | Primary name of the compound. |
ubiquitous | boolean | TRUE if this compound is found in most reactions, else FALSE. |
Compound Indexes
Table | Name | Type | Fields | Notes |
Compound | idx0 | | label | This index allows searching for compounds by name. |
Compound | idx1 | | id | Primary index for Compound. |
Contig Entity
A contig is a contiguous sequence of base pairs belonging to a single genome. The key of the contig is the genome ID followed by a colon and then the contig ID.
Relationships
- Contig HasSection DNASequence (one-to-many)
- Contig IsLocusFor Feature (many-to-many)
- Contig MakesUp Genome (many-to-one)
Contig Fields
Name | Type | Notes |
id | string | Unique identifier for this Contig. |
length | counter | Number of base pairs in the contig. |
md5-identifier | string | MD5 identifier of this contig, for comparison with contigs in other databases. |
Contig Indexes
Table | Name | Type | Fields | Notes |
Contig | idx0 | | md5-identifier | This index allows searching for contigs by MD5 identifier. |
Contig | idx1 | | id | Primary index for Contig. |
CoregulatedSet Entity
A coregulated set is a group of features that are believed to express up or down in concert. The coregulation is inferred from the experimental expression data.
Relationships
CoregulatedSet Fields
Name | Type | Notes |
id | hash-string | Unique identifier for this CoregulatedSet. |
reason | text | Description of how this coregulated set was derived. |
CoregulatedSet Indexes
Table | Name | Type | Fields | Notes |
CoregulatedSet | idx0 | | id | Primary index for CoregulatedSet. |
DNASequence Entity
A DNA sequence is a segment of a contig. Contigs are broken into multiple DNA sequences in order to prevent problems that might arise from loading an entire contig into memory when it consists of a billion or more base pairs.
The maximum length of a DNA sequence is one million bases. The key is the contig ID followed by a 7-digit ordinal number. So, the first sequence will have an ordinal of 0000000 and contain the base pairs from position 1 to position 1,000,000, the second will have an ordinal of 0000001 and contain the pairs from position 1,000,001 to 2,000,000, and so forth. This will allow us to store 10 trillion base pairs per contig.
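The chunking arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the colon separator between the contig ID and the ordinal is an assumption, since the text does not say how the two parts are joined.

```python
CHUNK_SIZE = 1_000_000   # maximum bases per DNA sequence, as stated above
ORDINAL_DIGITS = 7       # width of the ordinal in the key


def sequence_key(contig_id: str, ordinal: int, separator: str = ":") -> str:
    """Build a DNASequence key. The separator is an assumption, not from the schema."""
    return f"{contig_id}{separator}{ordinal:0{ORDINAL_DIGITS}d}"


def locate(position: int) -> tuple[int, int]:
    """Map a 1-based contig position to (ordinal, 0-based offset within that sequence)."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return divmod(position - 1, CHUNK_SIZE)


def chunk_range(ordinal: int) -> tuple[int, int]:
    """Return the 1-based (first, last) contig positions covered by an ordinal."""
    start = ordinal * CHUNK_SIZE + 1
    return start, start + CHUNK_SIZE - 1


# 10**7 possible ordinals times 10**6 bases each gives 10**13 (10 trillion) bases.
capacity = 10**ORDINAL_DIGITS * CHUNK_SIZE
```

For example, `locate(1_000_001)` returns `(1, 0)`: the first base of the second sequence, which matches the ranges given in the text.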
Relationships
DNASequence Fields
Name | Type | Notes |
id | string | Unique identifier for this DNASequence. |
sequence | dna | Base pairs that make up this sequence. |
DNASequence Indexes
Table | Name | Type | Fields | Notes |
DNASequence | idx0 | | id | Primary index for DNASequence. |
Diagram Entity
A functional diagram describes a network of chemical reactions, often comprising a single subsystem. A diagram is identified by a short name and contains a longer descriptive name.
Relationships
- Diagram Displays Reaction (many-to-many)
- Diagram IsRelevantFor Subsystem (many-to-many)
- Diagram Shows Compound (many-to-many)
- Diagram IncludesPartOf Scenario (many-to-many)
Diagram Fields
Name | Type | Notes |
id | string | Unique identifier for this Diagram. |
name | text | Descriptive name of this diagram. |
content | image array | The content of the diagram, in PNG format. |
Diagram Indexes
Table | Name | Type | Fields | Notes |
Diagram | idx0 | | id | Primary index for Diagram. |
EcNumber Entity
EC numbers are assigned by the Enzyme Commission, and consist of four numbers separated by periods, each indicating a successively smaller category of enzymes.
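As an illustration of the nesting, the sketch below expands an EC number into its chain of enclosing categories. The function name is hypothetical, and the code only handles fully specified four-part numbers.

```python
def ec_lineage(ec_number: str) -> list[str]:
    """Return the successively narrower categories named by a four-part EC number."""
    parts = ec_number.split(".")
    if len(parts) != 4:
        raise ValueError("expected four period-separated parts")
    # Each prefix of the number names a broader category than the next.
    return [".".join(parts[:i]) for i in range(1, 5)]


lineage = ec_lineage("1.1.1.1")  # ['1', '1.1', '1.1.1', '1.1.1.1']
```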
Relationships
- EcNumber IsConsistentWith Role (many-to-many)
- EcNumber Categorizes Reaction (many-to-many)
EcNumber Fields
Name | Type | Notes |
id | string | Unique identifier for this EcNumber. |
obsolete | boolean | This boolean indicates when an EC number is obsolete. |
replacedby | string | When an obsolete EC number is replaced with another EC number, this string will hold the name of the replacement EC number. |
EcNumber Indexes
Table | Name | Type | Fields | Notes |
EcNumber | idx0 | | id | Primary index for EcNumber. |
Experiment Entity
An experiment is a combination of conditions for which gene expression information is desired. The result of the experiment is a set of expression levels for features under the given conditions.
Relationships
- Experiment AffectsLevelOf AtomicRegulon (many-to-many)
- Experiment HasValueFor Attribute (many-to-many)
- Experiment OperatesIn Media (many-to-many)
- Experiment IndicatesSignalFor Feature (many-to-many)
- Experiment HasResultsFor Chip (many-to-one)
Experiment Fields
Name | Type | Notes |
id | string | Unique identifier for this Experiment. |
source | string | Publication or lab relevant to this experiment. |
Experiment Indexes
Table | Name | Type | Fields | Notes |
Experiment | idx0 | | id | Primary index for Experiment. |
Family Entity
A family is a group of features united by a particular determination algorithm. The algorithm will frequently, but not always, signify a functional role. The key is a character code for the kind of family (generally two or three letters) followed by the family's ID number.
Relationships
- Family HasMember Feature (many-to-many)
- Family IsCoupledTo Family (many-to-many)
- Family IsFamilyFor Role (many-to-many)
- Family IsRepresentedIn Genome (many-to-many)
- Family IsCoupledWith Family (many-to-many)
Family Fields
Name | Type | Notes |
id | string | Unique identifier for this Family. |
family-function | text array | Optional free-form description of the family. For function-based families, this would be the functional role for the family members. For other family types it could be a common name or an alias. |
Family Indexes
Table | Name | Type | Fields | Notes |
Family | idx0 | | id | Primary index for Family. |
FamilyType Entity
This entity contains the current version information for the FIGfams. It has one record that is not connected to any others, and is used by the incremental load to see if the FIGfams need to be reloaded. The key is the string "FIGfams".
FamilyType Fields
Name | Type | Notes |
id | string | Unique identifier for this FamilyType. |
version | string | |
FamilyType Indexes
Table | Name | Type | Fields | Notes |
FamilyType | idx0 | | id | Primary index for FamilyType. |
Feature Entity
A feature (sometimes also called a gene) is a part of a genome that is of special interest. Features may be spread across multiple DNA sequences (contigs) of a genome, but never across more than one genome. Each feature in the database has a unique FIG ID that functions as its ID in this table.
Relationships
- Feature HasIndicatedSignalFrom Experiment (many-to-many)
- Feature IsAnnotatedBy Annotation (one-to-many)
- Feature IsAttachmentSiteFor Feature (many-to-many)
- Feature IsCoregulatedWith Feature (many-to-many)
- Feature IsExemplarOf Role (many-to-many)
- Feature IsIdentifiedBy Identifier (many-to-many)
- Feature IsInPair Pairing (many-to-many)
- Feature IsLocatedIn Contig (many-to-many)
- Feature IsRegulatedWith CoregulatedSet (many-to-many)
- Feature OccursIn Cluster (many-to-many)
- Feature IsContainedIn MachineRole (many-to-many)
- Feature IsMemberOf Family (many-to-many)
- Feature HasLevelsFrom Chip (many-to-many)
- Feature HasAttachmentSite Feature (many-to-many)
- Feature HasCoregulationWith Feature (many-to-many)
- Feature IsFormedInto AtomicRegulon (many-to-one)
- Feature HasFunctional Role (many-to-many)
- Feature IsOwnedBy Genome (many-to-one)
- Feature Produces ProteinSequence (many-to-one)
Feature Fields
Name | Type | Notes |
id | string | Unique identifier for this Feature. |
feature-type | string | Code indicating the type of this feature. Among the codes currently supported are "peg" for a protein encoding gene, "bs" for a binding site, "opr" for an operon, and so forth. |
function | text | Functional assignment for this feature. This will often indicate the feature's functional role or roles, and may also have comments. |
locked | boolean | If TRUE, then this feature is locked and its functional role assignment cannot be changed. |
sequence-length | counter | Number of base pairs in this feature. |
essential | link array | A value indicating the essentiality of the feature. In most cases, this will be a word describing whether the essentiality is confirmed (essential) or potential (potential-essential), hyperlinked to the document from which the essentiality was curated. If a feature is not essential, this field will have no values; otherwise, it may have multiple values. |
evidence-code | string array | An evidence code describes the possible evidence that exists for deciding a feature's functional assignment. A feature may have no evidence, a single evidence code, or several. |
link | text array | Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The links are to other websites that have useful information about the gene that the feature represents, and are coded as raw HTML, using an anchor href tag. |
virulent | link array | A value indicating the virulence of the feature, coded as HTML. In most cases, this will be a phrase or SA number hyperlinked to the document from which the virulence information was curated. If the feature is not virulent, this field will have no values; otherwise, it may have multiple values. |
Feature Indexes
Table | Name | Type | Fields | Notes |
Feature | idx0 | | id | Primary index for Feature. |
FeatureEvidence | idx0 | | evidence-code, id | This index is used to find features by evidence code. |
Genome Entity
A genome represents a specific organism with DNA, or a specific meta-genome. All DNA sequences in the database belong to genomes.
Relationships
- Genome HasRepresentativeOf Family (many-to-many)
- Genome IsConfiguredBy AtomicRegulon (one-to-many)
- Genome IsMadeUpOf Contig (one-to-many)
- Genome IsModeledBy Model (one-to-many)
- Genome IsOwnerOf Feature (one-to-many)
- Genome Uses MolecularMachine (one-to-many)
- Genome IsCollectedInto GenomeSet (many-to-one)
- Genome IsInTaxa TaxonomicGrouping (many-to-one)
- Genome HadResultsProducedBy Chip (many-to-many)
Genome Fields
Name | Type | Notes |
id | string | Unique identifier for this Genome. |
complete | boolean | TRUE if the genome is complete, else FALSE. |
contigs | int | Number of contigs for this genome. |
dna-size | counter | Number of base pairs in the genome. |
domain | string | Domain for this organism (Archaea, Bacteria, Eukaryota, Virus, Plasmid, or Environmental Sample). |
gc-content | float | Percent GC content present in the genome's DNA. |
genetic-code | int | Genetic code number used for protein translation on most of this genome's contigs. |
md5-identifier | string | MD5 identifier for this genome, for comparison with genomes in other databases. |
pegs | int | Number of protein encoding genes for this genome. |
prokaryotic | boolean | TRUE if this is a prokaryotic genome, else FALSE. |
rnas | int | Number of RNA features found for this organism. |
scientific-name | string | Full genus/species/strain name of the genome. |
Genome Indexes
Table | Name | Type | Fields | Notes |
Genome | idx0 | | scientific-name | This index allows the applications to find all genomes in lexical order by name. |
Genome | idx1 | | md5-identifier | This index allows searching for genomes by MD5 identifier. |
Genome | idx2 | | id | Primary index for Genome. |
GenomeSet Entity
A genome set is a named group of related genomes.
Each genome set consists of genomes that use highly similar ribosomal small subunits. Two genomes are in the same set if they have a similarity of 97% or greater in subunits of length 1000 or more.
Relationships
GenomeSet Fields
Name | Type | Notes |
id | string | Unique identifier for this GenomeSet. |
GenomeSet Indexes
Table | Name | Type | Fields | Notes |
GenomeSet | idx0 | | id | Primary index for GenomeSet. |
Identifier Entity
An identifier is an alternate name for a feature or a protein sequence.
Identifiers are preferentially associated with features; however, in some cases the precise feature named by an external identifier cannot be computed, and the identifier is then associated with the protein sequence instead. Identifiers are stored in a prefixed format that ensures identifiers from different sources have different IDs.
Relationships
- Identifier HasAssertionFrom Source (many-to-many)
- Identifier Identifies Feature (many-to-many)
- Identifier Names ProteinSequence (many-to-many)
Identifier Fields
Name | Type | Notes |
id | string | Unique identifier for this Identifier. |
natural-form | string | Natural form of the identifier. This is how the identifier looks without the identifying prefix (if one is present). |
source | string | Specific type of the identifier, such as its source database or category. The type can usually be decoded to convert the identifier to a URL. |
Identifier Indexes
Table | Name | Type | Fields | Notes |
Identifier | idx0 | | source, natural-form | This index allows all the identifiers of a specified type to be located. |
Identifier | idx1 | | natural-form | This index allows looking up a natural identifier (that is, one without the identifying prefix). |
Identifier | idx2 | | id | Primary index for Identifier. |
MachineRole Entity
A machine role represents a role as it occurs in a molecular machine. The key is a colon-delimited triple containing an MD5 hash of the subsystem ID followed by a genome ID (with optional region string) and a role abbreviation.
The machine role corresponds to a cell on the subsystem spreadsheet. Features in the subsystem are assigned directly to the machine role.
Relationships
- MachineRole Contains Feature (many-to-many)
- MachineRole IsRoleFor MolecularMachine (many-to-one)
- MachineRole HasRole Role (many-to-one)
MachineRole Fields
Name | Type | Notes |
id | string | Unique identifier for this MachineRole. |
MachineRole Indexes
Table | Name | Type | Fields | Notes |
MachineRole | idx0 | | id | Primary index for MachineRole. |
Model Entity
A model specifies a relationship between sets of features and reactions in a cell. It is used to simulate cell growth and gene knockouts to validate annotations.
Relationships
- Model Models Genome (many-to-one)
- Model Requires Reaction (many-to-many)
Model Fields
Name | Type | Notes |
id | string | Unique identifier for this Model. |
Model Indexes
Table | Name | Type | Fields | Notes |
Model | idx0 | | id | Primary index for Model. |
MolecularMachine Entity
A molecular machine is a collection of features that implements a metabolic pathway. Machines are the physical instances of variants. Each machine corresponds to a row in a subsystem spreadsheet. The key is an MD5 hash formed from a colon-separated list containing the subsystem ID, the variant ID, the Genome ID, and an optional region string.
It is possible for a single subsystem to occur multiple times in a particular genome. If that is the case, then the subsystem will have a molecular machine for each occurrence. The region string used in forming the key ensures that each machine has a unique ID.
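The key construction described above can be sketched with Python's standard `hashlib` module. Several details are assumptions, because the text does not specify them: the input is encoded as UTF-8, the region is simply left off the list when empty, and the digest is rendered as hexadecimal (the actual `hash-string` encoding may differ). The function name and sample values are hypothetical.

```python
import hashlib


def machine_key(subsystem_id: str, variant_id: str, genome_id: str,
                region: str = "") -> str:
    """Hash a colon-separated list of the identifying parts into a machine key."""
    parts = [subsystem_id, variant_id, genome_id]
    if region:
        # Assumption: an empty region is omitted rather than left as a trailing field.
        parts.append(region)
    joined = ":".join(parts)
    # Assumption: hexadecimal rendering of the MD5 digest.
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Two occurrences of one subsystem in one genome differ only by region,
# so they still receive distinct keys. All values here are made up.
whole_genome = machine_key("Histidine Degradation", "1", "83333.1")
one_region = machine_key("Histidine Degradation", "1", "83333.1", "contigA_100_5000")
```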
Relationships
- MolecularMachine IsMachineOf MachineRole (one-to-many)
- MolecularMachine Implements Variant (many-to-one)
- MolecularMachine IsUsedBy Genome (many-to-one)
MolecularMachine Fields
Name | Type | Notes |
id | hash-string | Unique identifier for this MolecularMachine. |
curated | boolean | This flag is TRUE if the assignment of the molecular machine has been curated, and FALSE if it was made by an automated program. |
region | string | Region in the genome for which the machine is relevant. Normally, this is an empty string, indicating that the machine covers the whole genome. If a subsystem has multiple machines for a genome, this contains a location string describing the region occupied by this particular machine. |
MolecularMachine Indexes
Table | Name | Type | Fields | Notes |
MolecularMachine | idx0 | | id | Primary index for MolecularMachine. |
PairSet Entity
A pair set indicates evidence for a functional connection between protein sequence pairs. The protein sequences possessing the connection are the ones that participate in the evidence set's pairings.
The pairings for a particular evidence set will contain protein sequences that are significantly similar. In other words, if (A,B) and (X,Y) are both pairings in a single evidence set, then (A =~ X) and (B =~ Y) or (A =~ Y) and (B =~ X), depending on the value of the "inverted" attribute of the IsDeterminedBy relationship. Essentially, a pairing in its own right is unordered. If (A,B) is a pair, then so is (B,A). However, the evidence set maintains a correspondence between its pairs that _is_ ordered, because the constituent pairs must match. The direction in which a pair matches others in the set is an attribute of the relationship from the pairs to the sets.
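One way to picture the role of the "inverted" attribute is the sketch below, which is an interpretation of the paragraph above rather than the database's actual logic. It assumes that a pairing stored as (first, second) is flipped when its "inverted" flag is set, so that every pairing in a set lines up component by component. The function name is hypothetical.

```python
def oriented(pair: tuple[str, str], inverted: bool) -> tuple[str, str]:
    """Return the pair in the order that matches the other pairs in its set."""
    first, second = pair
    return (second, first) if inverted else (first, second)


# Suppose A =~ X and B =~ Y, but the second pairing happens to be stored as (Y, X).
members = [(("A", "B"), False), (("Y", "X"), True)]
aligned = [oriented(pair, inverted) for pair, inverted in members]
# After orientation, position 0 holds A and X, position 1 holds B and Y.
```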
Relationships
PairSet Fields
Name | Type | Notes |
id | int | Unique identifier for this PairSet. |
score | int | Score for this evidence set. The score indicates the number of significantly different genomes represented by the pairings. |
PairSet Indexes
Table | Name | Type | Fields | Notes |
PairSet | idx0 | | id | Primary index for PairSet. |
Pairing Entity
A pairing indicates that two features are found close together in a genome. Not all possible pairings are stored in the database; only those that are considered for some reason to be significant for annotation purposes. The key of the pairing is the concatenation of the feature IDs in alphabetical order with an intervening colon.
Theoretically, the pairing is unordered: (A,B) and (B,A) are the same pairing. It is frequently the case, however, that we need to refer to the "first" or "second" protein in the pairing. When this happens, the first one is always the protein with the alphabetically lesser key. The IsInPair relationship automatically shows the proteins in this order.
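A minimal sketch of the key rule follows. It assumes that "alphabetical order" means ordinary string comparison, which is what Python's `sorted` applies to strings; the function name and the feature IDs are hypothetical.

```python
def pairing_key(feature_a: str, feature_b: str) -> str:
    """Join two feature IDs with a colon, lesser ID first, so order of input is irrelevant."""
    first, second = sorted((feature_a, feature_b))
    return f"{first}:{second}"


# Both argument orders yield the same key.
key_ab = pairing_key("fig|83333.1.peg.3", "fig|83333.1.peg.20")
key_ba = pairing_key("fig|83333.1.peg.20", "fig|83333.1.peg.3")
```

Note that string comparison places `peg.20` before `peg.3`, because the comparison is character by character rather than numeric.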
Relationships
- Pairing Determines PairSet (many-to-one)
- Pairing IsPairOf Feature (many-to-many)
Pairing Fields
Name | Type | Notes |
id | string | Unique identifier for this Pairing. |
Pairing Indexes
Table | Name | Type | Fields | Notes |
Pairing | idx0 | | id | Primary index for Pairing. |
ProteinSequence Entity
A protein sequence is a specific sequence of amino acids. Unlike a DNA sequence, a protein sequence does not belong to a genome. Identical proteins generated by different genomes are generally stored as a single ProteinSequence instance. The key is a hash code computed from the protein letter sequence.
Relationships
- ProteinSequence Exposes Structure (many-to-many)
- ProteinSequence IsNamedBy Identifier (many-to-many)
- ProteinSequence IsProteinFor Feature (one-to-many)
- ProteinSequence ProjectsOnto ProteinSequence (many-to-many)
- ProteinSequence IsAlignedBy AlignmentTree (many-to-many)
- ProteinSequence IsATopicOf Publication (many-to-many)
- ProteinSequence IsProjectedOnto ProteinSequence (many-to-many)
ProteinSequence Fields
Name | Type | Notes |
id | string | Unique identifier for this ProteinSequence. |
sequence | text | The sequence contains the letters corresponding to the protein's amino acids. |
ProteinSequence Indexes
Table | Name | Type | Fields | Notes |
ProteinSequence | idx0 | | id | Primary index for ProteinSequence. |
Publication Entity
A publication is an article or citation that may be used as evidence for assertions made in the database. The key is the PUBMED ID.
Relationships
Publication Fields
Name | Type | Notes |
id | string | Unique identifier for this Publication. |
citation | link | Hyperlink of the article. The text is the article title. |
Publication Indexes
Table | Name | Type | Fields | Notes |
Publication | idx0 | | citation | This index provides the ability to search by article title. It should only be used for LIKE-style searches, since the article titles are encoded together with the URLs. |
Publication | idx1 | | id | Primary index for Publication. |
Reaction Entity
A reaction is a chemical process that converts one set of compounds (substrates) to another set (products). The reaction ID is generally a small number preceded by a letter.
Relationships
- Reaction Involves Compound (many-to-many)
- Reaction IsCategorizedInto EcNumber (many-to-many)
- Reaction IsRequiredBy Model (many-to-many)
- Reaction IsDisplayedOn Diagram (many-to-many)
- Reaction ParticipatesIn Scenario (many-to-many)
- Reaction IsElementOf Complex (many-to-many)
Reaction Fields
Name | Type | Notes |
id | string | Unique identifier for this Reaction. |
Reaction Indexes
Table | Name | Type | Fields | Notes |
Reaction | idx0 | | id | Primary index for Reaction. |
Role Entity
A role describes a biological function that may be fulfilled by a feature. One of the main goals of the database is to assign features to roles. Most roles are effected by the construction of proteins. Some, however, deal with functional regulation and message transmission.
Relationships
- Role IsFunctionalIn Feature (many-to-many)
- Role IsRoleOf MachineRole (one-to-many)
- Role IsIncludedIn Subsystem (many-to-many)
- Role IsConsistentTo EcNumber (many-to-many)
- Role HasAsExemplar Feature (many-to-many)
- Role DeterminesFunctionOf Family (many-to-many)
- Role Triggers Complex (many-to-many)
Role Fields
Name | Type | Notes |
id | string | Unique identifier for this Role. |
hypothetical | boolean | TRUE if a role is hypothetical, else FALSE. |
role-index | int array | Index of this role in role vectors, or -1 if the role does not appear in role vectors. |
Role Indexes
Table | Name | Type | Fields | Notes |
Role | idx0 | | id | Primary index for Role. |
RoleIndex | idx0 | | role-index | This index is used to find roles by role-index. |
Scenario Entity
A scenario is a partial instance of a subsystem with a defined set of reactions. Each scenario converts input compounds to output compounds using reactions. The scenario may use all of the reactions controlled by a subsystem or only some, and may also incorporate additional reactions. Because scenario names are not unique, the actual scenario ID is a number.
Relationships
- Scenario HasParticipant Reaction (many-to-many)
- Scenario Overlaps Diagram (many-to-many)
- Scenario Validates Subsystem (many-to-one)
- Scenario HasAsTerminus Compound (many-to-many)
Scenario Fields
Name | Type | Notes |
id | int | Unique identifier for this Scenario. |
common-name | string | Common name of the scenario. The name, rather than the ID number, is usually displayed everywhere. |
Scenario Indexes
Table | Name | Type | Fields | Notes |
Scenario | idx0 | | common-name | This index allows the user to search for a scenario by name. |
Scenario | idx1 | | id | Primary index for Scenario. |
Source Entity
A source is a user or organization that is permitted to make assertions about identifiers.
Relationships
Source Fields
Name | Type | Notes |
id | string | Unique identifier for this Source. |
Source Indexes
Table | Name | Type | Fields | Notes |
Source | idx0 | | id | Primary index for Source. |
Structure Entity
A structure is the geometrical representation of a protein sequence. A single protein sequence may have multiple structural representations, either because it is folded in different ways or because there are alternative representation formats. The key field is the representation type (currently only PDB) followed by a colon and the ID.
Relationships
- Structure Attracts Compound (many-to-many)
- Structure IsExposedBy ProteinSequence (many-to-many)
Structure Fields
Name | Type | Notes |
id | string | Unique identifier for this Structure. |
Structure Indexes
Table | Name | Type | Fields | Notes |
Structure | idx0 | | id | Primary index for Structure. |
Subsystem Entity
A subsystem is a collection of roles that work together in a cell. Identification of subsystems is an important tool for recognizing parallel genetic features in different organisms. The key is subsystem name.
The subsystem name used to come in two forms: a natural form with spaces and an internal form with underscores. In this database we will only have the natural form.
Relationships
- Subsystem Describes Variant (one-to-many)
- Subsystem Includes Role (many-to-many)
- Subsystem IsSubInstanceOf Scenario (one-to-many)
- Subsystem IsInClass SubsystemClass (many-to-one)
- Subsystem IsRelevantTo Diagram (many-to-many)
Subsystem Fields
Name | Type | Notes |
id | string | Unique identifier for this Subsystem. |
cluster-based | boolean | TRUE if this is a clustering-based subsystem, else FALSE. A clustering-based subsystem is one in which there is functional-coupling evidence that genes belong together, but we do not yet know what they do. |
curator | string | Name of the person currently in charge of the subsystem. |
description | text | Description of the subsystem's function in the cell. |
experimental | boolean | TRUE if this is an experimental subsystem. An experimental subsystem is designed for investigation and is not yet ready to be used in comparative analysis and annotation. |
notes | text | Descriptive notes about the subsystem. |
private | boolean | TRUE if this is a private subsystem, else FALSE. A private subsystem has valid data, but is not considered ready for general distribution. |
usable | boolean | TRUE if this is a usable subsystem, else FALSE. An unusable subsystem is one that is experimental or is of such low quality that it can negatively affect analysis. |
version | int | Version number for the subsystem. This value is incremented each time the subsystem is backed up. |
Subsystem Indexes
Table | Name | Type | Fields | Notes |
Subsystem | idx0 | | id | Primary index for Subsystem. |
SubsystemClass Entity
Subsystem classes impose a hierarchical organization on the subsystems.
Relationships
- SubsystemClass IsClassFor Subsystem (one-to-many)
- SubsystemClass IsSuperclassOf SubsystemClass (one-to-many)
- SubsystemClass IsSubclassOf SubsystemClass (many-to-one)
SubsystemClass Fields
Name | Type | Notes |
id | string | Unique identifier for this SubsystemClass. |
SubsystemClass Indexes
Table | Name | Type | Fields | Notes |
SubsystemClass | idx0 | | id | Primary index for SubsystemClass. |
TaxonomicGrouping Entity
A taxonomic grouping is a segment of the classification for an organism. Taxonomic groupings are organized into a strict hierarchy by the IsGroupContaining relationship.
Relationships
- TaxonomicGrouping IsGroupFor TaxonomicGrouping (one-to-many)
- TaxonomicGrouping IsTaxonomyOf Genome (one-to-many)
- TaxonomicGrouping IsInGroup TaxonomicGrouping (many-to-one)
TaxonomicGrouping Fields
Name | Type | Notes |
id | counter | Unique identifier for this TaxonomicGrouping. |
domain | boolean | TRUE if this is a domain grouping, else FALSE. |
hidden | boolean | TRUE if this is a hidden grouping, else FALSE. Hidden groupings are not typically shown in a lineage list. |
scientific-name | string | Primary scientific name for this grouping. This is the name used when displaying a taxonomy. |
alias | string array | Alternate name for this grouping. A grouping may have many alternate names. The scientific name should also be in this list. |
TaxonomicGrouping Indexes
Table | Name | Type | Fields | Notes |
TaxonomicGrouping | idx0 | | id | Primary index for TaxonomicGrouping. |
TaxonomicGroupingAlias | idx0 | | alias | This index allows the user to find a particular taxonomic grouping by name. Because the scientific name is also an alias, there is no index on scientific name. |
Variant Entity
A variant is a functional subset of a subsystem. It indicates the particular sequence of roles used to implement a metabolic pathway. Variants are abstract concepts used to classify machines. The key of the variant is an MD5 hash of the subsystem ID followed by the variant code.
The variant code is generally a number with zero or more decimal points, similar to what is done with software version numbers or legal outline numbers.
Relationships
- Variant IsImplementedBy MolecularMachine (one-to-many)
- Variant IsDescribedBy Subsystem (many-to-one)
Variant Fields
Name | Type | Notes |
id | hash-string | Unique identifier for this Variant. |
code | string | This is the variant code all by itself. |
comment | text | Commentary text about the variant. |
type | string | The variant type indicates the quality of the subsystem support. A type of "vacant" means that the subsystem does not appear to be implemented by the variant. A type of "incomplete" means that the subsystem appears to be missing many reactions. In all other cases, the type is "normal". |
role-rule | text array | A space-delimited list of role IDs, in alphabetical order, that represents a possible list of non-auxiliary roles applicable to this variant. The roles are identified by their abbreviations. A variant may have multiple role rules. |
Variant Indexes
Table | Name | Type | Fields | Notes |
Variant | idx0 | | id | Primary index for Variant. |
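As a rough sketch of the two ideas above (a hashed key and a dotted variant code), the following shows one way they might be handled. The exact concatenation, character encoding, and digest format used for the real key are not specified here; the sketch assumes plain concatenation, UTF-8, and a hexadecimal digest. The function names and the sample subsystem ID are hypothetical.

```python
import hashlib
from typing import Tuple


def variant_key(subsystem_id: str, variant_code: str) -> str:
    # ASSUMPTION: plain concatenation, UTF-8 encoding, hex digest.
    # The real key format may differ (separator, digest encoding).
    text = subsystem_id + variant_code
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def parse_variant_code(code: str) -> Tuple[int, ...]:
    # "1.2.10" -> (1, 2, 10), so codes compare like version numbers.
    # Raises ValueError for codes that are not purely numeric.
    return tuple(int(part) for part in code.split("."))


codes = ["1.10", "1.2", "2", "1"]
sorted_codes = sorted(codes, key=parse_variant_code)

example_key = variant_key("Example_subsystem", "1.2")
```

Sorting by the parsed tuple places "1.2" before "1.10", which a plain string sort would not do.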
AffectsLevelOf Relationship
This relationship indicates the expression level of an atomic regulon for a given experiment.
AffectsLevelOf Fields
Name | Type | Notes |
from-link | string | id of the source Experiment. |
level | int | Indication of whether the feature is expressed (1), not expressed (-1), or unknown (0). |
to-link | string | id of the target AtomicRegulon. |
AffectsLevelOf Indexes
Table | Name | Type | Fields | Notes |
AffectsLevelOf | idxFrom | | from-link | |
AffectsLevelOf | idxTo | | to-link | |
Aligns Relationship
This relationship connects each alignment to its constituent protein sequences. Each alignment contains many protein sequences, and a single sequence can be in many alignments. Parts of a single protein can occur in multiple places in an alignment. The sequence-id field is used to keep these unique, and is the string that represents the sequence in the alignment and tree text.
Aligns Fields
Name | Type | Notes |
begin | counter | location within the sequence at which the aligned portion begins |
end | counter | location within the sequence at which the aligned portion ends |
from-link | string | id of the source AlignmentTree. |
len | counter | length of the sequence within the alignment |
properties | text | additional information about this sequence's participation in the alignment |
sequence-id | string | identifier for this sequence in the alignment |
to-link | string | id of the target ProteinSequence. |
Aligns Indexes
Table | Name | Type | Fields | Notes |
Aligns | idxFrom | | from-link | This index puts additional fields into the primary key of the table to ensure that it is unique if a single protein sequence is used twice by the same alignment. |
Aligns | idxTo | | to-link | |
Attracts Relationship
This relationship connects a compound to the protein structures that attract it. This is an incomplete relationship that exists to service drug targeting queries. Only the attractions whose parameters have been determined through modeling or experimentation are included. The goal is to determine the docking energy between the compound and the protein structure.
Attracts Fields
Name | Type | Notes |
electrostatic-energy | float | Docking energy in kcal/mol that results from the movement of electrons (electrostatic force) between the structure and the compound. |
from-link | string | id of the source Structure. |
reason | string | Indication of the reason for determining the docking energy. A value of "Random" indicates the docking was attempted as a part of a random survey used to determine the docking characteristics of a protein structure. A value of "Rich" indicates the docking was attempted because a low-energy docking result was predicted for the compound. |
to-link | string | id of the target Compound. |
tool | string | Name of the tool used to compute the docking energy. |
total-energy | float | Total energy required for the compound to dock with the structure, in kcal/mol. A negative value means energy is released. |
vanderwaals-energy | float | Docking energy in kcal/mol that results from the geometric fit (Van der Waals force) between the structure and the compound. |
Attracts Indexes
Table | Name | Type | Fields | Notes |
Attracts | idxFrom | | from-link, total-energy | This index enables the application to view a structure's docking results from the lowest energy (best docking) to highest energy (worst docking). |
Attracts | idxTo | | to-link, total-energy | This index enables the application to view a compound's docking results from the lowest energy (best docking) to highest energy (worst docking). |
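The ordering provided by these indexes can be mimicked in memory as follows. This is only a sketch: the rows are hypothetical dictionaries keyed by the field names above, and the IDs and energy values are invented for illustration.

```python
from typing import Dict, List, Optional


def best_dockings(rows: List[Dict], structure_id: str,
                  limit: Optional[int] = None) -> List[Dict]:
    """Docking results for one structure, lowest total energy first."""
    hits = [r for r in rows if r["from-link"] == structure_id]
    hits.sort(key=lambda r: r["total-energy"])
    return hits[:limit]  # limit=None returns every hit


# Hypothetical sample rows (energies in kcal/mol).
rows = [
    {"from-link": "struct1", "to-link": "cpdA", "total-energy": -3.5},
    {"from-link": "struct1", "to-link": "cpdB", "total-energy": -7.25},
    {"from-link": "struct2", "to-link": "cpdA", "total-energy": -9.0},
    {"from-link": "struct1", "to-link": "cpdC", "total-energy": 1.0},
]

ranked = [r["to-link"] for r in best_dockings(rows, "struct1")]
top_one = [r["to-link"] for r in best_dockings(rows, "struct1", limit=1)]
```

Because a negative total energy means energy is released, the most negative value sorts first and represents the best docking.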
Concerns Relationship
This relationship connects a publication to the protein sequences it describes.
Concerns Fields
Concerns Indexes
Table | Name | Type | Fields | Notes |
Concerns | idxFrom | | from-link | |
Concerns | idxTo | | to-link | |
Contains Relationship
This relationship connects a machine role to the features that occur in it. A feature may occur in many machine roles and a machine role may contain many features. The subsystem annotation process is essentially the maintenance of this relationship.
Contains Fields
Name | Type | Notes |
from-link | string | id of the source MachineRole. |
to-link | string | id of the target Feature. |
Contains Indexes
Table | Name | Type | Fields | Notes |
Contains | idxFrom | | from-link | |
Contains | idxTo | | to-link | |
Describes Relationship
This relationship connects a subsystem to the individual variants used to implement it. Each variant contains a slightly different subset of the roles in the parent subsystem.
Describes Fields
Name | Type | Notes |
from-link | string | id of the source Subsystem. |
to-link | hash-string | id of the target Variant. |
Describes Indexes
Table | Name | Type | Fields | Notes |
Describes | idxFrom | | from-link | |
Describes | idxTo | unique | to-link | |
Displays Relationship
This relationship connects a diagram to its reactions. A diagram shows multiple reactions, and a reaction can be on many diagrams.
Displays Fields
Name | Type | Notes |
from-link | string | id of the source Diagram. |
location | rectangle | Location of the reaction's node on the diagram. |
to-link | string | id of the target Reaction. |
Displays Indexes
Table | Name | Type | Fields | Notes |
Displays | idxFrom | | from-link | |
Displays | idxTo | | to-link | |
Exposes Relationship
This relationship connects a protein sequence to its structural representations. It is a many-to-many relationship. Note that only some protein sequences have known structural representations.
Exposes Fields
Exposes Indexes
Table | Name | Type | Fields | Notes |
Exposes | idxFrom | | from-link | |
Exposes | idxTo | | to-link | |
GeneratedLevelsFor Relationship
This relationship connects an atomic regulon to a chip from which experimental data was produced for its features. It contains a vector of the expression levels.
GeneratedLevelsFor Fields
Name | Type | Notes |
from-link | string | id of the source Chip. |
level-vector | countVector | Vector of expression levels (-1, 0, 1) for the experiments, in sequence order. |
to-link | string | id of the target AtomicRegulon. |
GeneratedLevelsFor Indexes
Table | Name | Type | Fields | Notes |
GeneratedLevelsFor | idxFrom | | from-link | |
GeneratedLevelsFor | idxTo | | to-link | |
HasAssertionFrom Relationship
Sources (users) can make assertions about identifiers using the annotation clearinghouse. When a user makes a new assertion about an identifier, it erases the old one.
HasAssertionFrom Fields
Name | Type | Notes |
expert | boolean | TRUE if this is an expert assertion, else FALSE |
from-link | string | id of the source Identifier. |
function | text | The function is the text of the assertion made about the identifier. |
to-link | string | id of the target Source. |
HasAssertionFrom Indexes
Table | Name | Type | Fields | Notes |
HasAssertionFrom | idxFrom | | from-link | |
HasAssertionFrom | idxTo | | to-link | |
HasIndicatedSignalFrom Relationship
This relationship connects an experiment to a feature. The feature expression levels inferred from the experimental results are stored here.
HasIndicatedSignalFrom Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
level | int | Indication of whether the feature is expressed (1), not expressed (-1), or unknown (0). |
rma-value | float | Normalized expression value for this feature under the experiment's conditions. |
to-link | string | id of the target Experiment. |
HasIndicatedSignalFrom Indexes
Table | Name | Type | Fields | Notes |
HasIndicatedSignalFrom | idxFrom | | from-link | |
HasIndicatedSignalFrom | idxTo | | to-link | |
HasMember Relationship
This relationship connects each feature family to its constituent features. A family always has many features, and a single feature can be found in many families.
HasMember Fields
Name | Type | Notes |
from-link | string | id of the source Family. |
to-link | string | id of the target Feature. |
HasMember Indexes
Table | Name | Type | Fields | Notes |
HasMember | idxFrom | | from-link | |
HasMember | idxTo | | to-link | |
HasParticipant Relationship
A scenario consists of many participant reactions that convert the input compounds to output compounds. A single reaction may participate in many scenarios.
HasParticipant Fields
Name | Type | Notes |
from-link | int | id of the source Scenario. |
to-link | string | id of the target Reaction. |
type | int | Indicates the type of participation. If 0, the reaction is in the main pathway of the scenario. If 1, the reaction is necessary to make the model work but is not in the subsystem. If 2, the reaction is part of the subsystem but should not be included in the modelling process. |
HasParticipant Indexes
Table | Name | Type | Fields | Notes |
HasParticipant | idxFrom | | from-link, type | This index presents the reactions in the scenario in order from most important to least important. |
HasParticipant | idxTo | | to-link | |
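The participation codes above can be given names in application code. The sketch below is one possible reading: the enum member names, helper functions, and sample rows are all hypothetical, and treating types 0 and 1 as "the reactions used for modelling" is an inference from the field notes rather than a documented rule.

```python
from enum import IntEnum
from typing import Dict, List


class ParticipationType(IntEnum):
    MAIN_PATHWAY = 0    # in the main pathway of the scenario
    MODEL_ONLY = 1      # needed by the model, not in the subsystem
    SUBSYSTEM_ONLY = 2  # in the subsystem, excluded from modelling


def scenario_reactions(rows: List[Dict], scenario_id: int) -> List[Dict]:
    """Rows for one scenario in (from-link, type) order, like idxFrom."""
    hits = [r for r in rows if r["from-link"] == scenario_id]
    return sorted(hits, key=lambda r: r["type"])


def reactions_for_modelling(rows: List[Dict], scenario_id: int) -> List[str]:
    # ASSUMPTION: modelling uses types 0 and 1 and skips type 2.
    keep = (ParticipationType.MAIN_PATHWAY, ParticipationType.MODEL_ONLY)
    return [r["to-link"] for r in scenario_reactions(rows, scenario_id)
            if r["type"] in keep]


# Hypothetical sample rows.
rows = [
    {"from-link": 1, "to-link": "rxnC", "type": 2},
    {"from-link": 1, "to-link": "rxnA", "type": 0},
    {"from-link": 1, "to-link": "rxnB", "type": 1},
    {"from-link": 2, "to-link": "rxnD", "type": 0},
]

ordered = [r["to-link"] for r in scenario_reactions(rows, 1)]
model_reactions = reactions_for_modelling(rows, 1)
```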
HasRepresentativeOf Relationship
This relationship connects a genome to the FIGfam protein families for which it has representative proteins. This information can be computed from other relationships, but it is provided explicitly to allow fast access to a genome's FIGfam profile.
HasRepresentativeOf Fields
Name | Type | Notes |
from-link | string | id of the source Genome. |
to-link | string | id of the target Family. |
HasRepresentativeOf Indexes
Table | Name | Type | Fields | Notes |
HasRepresentativeOf | idxFrom | | from-link | |
HasRepresentativeOf | idxTo | | to-link | |
HasResultsIn Relationship
This relationship connects a chip to the experiments that were applied to it.
HasResultsIn Fields
Name | Type | Notes |
from-link | string | id of the source Chip. |
sequence | int | Sequence number of this experiment in the various result vectors. |
to-link | string | id of the target Experiment. |
HasResultsIn Indexes
Table | Name | Type | Fields | Notes |
HasResultsIn | idxFrom | | from-link | |
HasResultsIn | idxTo | unique | to-link, sequence | This index allows you to access the experiments for a chip in sequence order. |
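The sequence number is what ties an experiment to its slot in a level vector. A minimal sketch of that lookup follows; it assumes sequence numbers are 0-based positions in the vector, which is not stated here, and the function name and sample IDs are hypothetical.

```python
from typing import Dict, List

# Meaning of the level codes used in the expression-level fields.
LEVEL_LABELS = {1: "expressed", -1: "not expressed", 0: "unknown"}


def levels_by_experiment(has_results_in_rows: List[Dict],
                         level_vector: List[int]) -> Dict[str, str]:
    """Map each experiment on a chip to its labeled level in the vector."""
    result = {}
    for row in sorted(has_results_in_rows, key=lambda r: r["sequence"]):
        # ASSUMPTION: "sequence" is a 0-based index into the vector.
        level = level_vector[row["sequence"]]
        result[row["to-link"]] = LEVEL_LABELS[level]
    return result


# Hypothetical HasResultsIn rows for one chip and one level vector.
chip_rows = [
    {"from-link": "chip1", "to-link": "expB", "sequence": 1},
    {"from-link": "chip1", "to-link": "expA", "sequence": 0},
    {"from-link": "chip1", "to-link": "expC", "sequence": 2},
]
vector = [1, -1, 0]

labeled = levels_by_experiment(chip_rows, vector)
```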
HasSection Relationship
This relationship connects a contig to its actual DNA sequences.
HasSection Fields
Name | Type | Notes |
from-link | string | id of the source Contig. |
to-link | string | id of the target DNASequence. |
HasSection Indexes
Table | Name | Type | Fields | Notes |
HasSection | idxFrom | | from-link | |
HasSection | idxTo | unique | to-link | |
HasValueFor Relationship
This relationship connects an experiment to its attributes. The attribute values are stored here.
HasValueFor Fields
Name | Type | Notes |
from-link | string | id of the source Experiment. |
to-link | string | id of the target Attribute. |
value | string | Value of this attribute in the given experiment. This is always encoded as a string, but may in fact be a number. |
HasValueFor Indexes
Table | Name | Type | Fields | Notes |
HasValueFor | idxFrom | | from-link | |
HasValueFor | idxTo | | to-link | |
Includes Relationship
A subsystem is defined by its roles. The subsystem's variants contain slightly different sets of roles, but all of the roles in a variant must be connected to the parent subsystem by this relationship. A subsystem always has at least one role, and a role always belongs to at least one subsystem.
Includes Fields
Name | Type | Notes |
abbreviation | string | Abbreviation for this role in this subsystem. The abbreviations are used in columnar displays, and they also appear on diagrams. |
auxiliary | boolean | TRUE if this is an auxiliary role, or FALSE if this role is a functioning part of the subsystem. |
from-link | string | id of the source Subsystem. |
sequence | counter | Sequence number of the role within the subsystem. When the roles are formed into a variant, they will generally appear in sequence order. |
to-link | string | id of the target Role. |
Includes Indexes
Table | Name | Type | Fields | Notes |
Includes | idxFrom | | from-link, sequence | This index ensures that the roles of the subsystem are presented in sequence order. |
Includes | idxTo | | to-link | |
IndicatedLevelsFor Relationship
This relationship connects a feature to a chip from which experimental data was produced for the feature. It contains a vector of the expression levels.
IndicatedLevelsFor Fields
Name | Type | Notes |
from-link | string | id of the source Chip. |
level-vector | countVector | Vector of expression levels (-1, 0, 1) for the experiments, in sequence order. |
to-link | string | id of the target Feature. |
IndicatedLevelsFor Indexes
Table | Name | Type | Fields | Notes |
IndicatedLevelsFor | idxFrom | | from-link | |
IndicatedLevelsFor | idxTo | | to-link | |
Involves Relationship
This relationship connects a reaction to the compounds that participate in it. A reaction involves many compounds, and a compound can be involved in many reactions. The relationship attributes indicate whether a compound is a product or substrate of the reaction, as well as its stoichiometry.
Involves Fields
Name | Type | Notes |
cofactor | boolean | TRUE if the compound is a cofactor; FALSE if it is a major component of the reaction. |
from-link | string | id of the source Reaction. |
product | boolean | TRUE if the compound is a product of the reaction, FALSE if it is a substrate. When a reaction is written on paper in chemical notation, the substrates are left of the arrow and the products are to the right. Sorting on this field will cause the substrates to appear first, followed by the products. If the reaction is reversible, then the notion of substrates and products is not intuitive; however, a value here of FALSE still puts the compound left of the arrow and a value of TRUE still puts it to the right. |
stoichiometry | float | Number of molecules of the compound that participate in a single instance of the reaction. For example, if a reaction produces two water molecules, the stoichiometry of water for the reaction would be two. When a reaction is written on paper in chemical notation, the stoichiometry is the number next to the chemical formula of the compound. |
to-link | string | id of the target Compound. |
Involves Indexes
Table | Name | Type | Fields | Notes |
Involves | idxFrom | | from-link | |
Involves | idxTo | | to-link, product | This index presents the compounds in the reaction in the order they should be displayed when writing it in chemical notation. All the substrates appear before all the products, and within that ordering, the main compounds appear first. |
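To make the product, cofactor, and stoichiometry fields concrete, here is a small sketch that writes a reaction in chemical notation from its Involves rows. The function names are hypothetical, the compound names stand in for real compound IDs, and the "=>" arrow is an arbitrary choice for illustration.

```python
from typing import Dict, List


def format_side(rows: List[Dict]) -> str:
    parts = []
    for r in rows:
        n = r["stoichiometry"]
        # Omit a coefficient of 1, as is usual in chemical notation.
        coeff = "" if n == 1 else f"{n:g} "
        parts.append(coeff + r["to-link"])
    return " + ".join(parts)


def format_reaction(rows: List[Dict]) -> str:
    # FALSE sorts before TRUE: substrates first, then products;
    # within each side, main compounds come before cofactors.
    ordered = sorted(rows, key=lambda r: (r["product"], r["cofactor"]))
    substrates = [r for r in ordered if not r["product"]]
    products = [r for r in ordered if r["product"]]
    return f"{format_side(substrates)} => {format_side(products)}"


# Hypothetical rows; compound names stand in for compound IDs.
rows = [
    {"to-link": "H2O", "product": True, "cofactor": False, "stoichiometry": 2.0},
    {"to-link": "O2", "product": False, "cofactor": False, "stoichiometry": 1.0},
    {"to-link": "H2", "product": False, "cofactor": False, "stoichiometry": 2.0},
]

equation = format_reaction(rows)
```

Python's sort is stable, so compounds that tie on both flags keep their original row order.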
IsAnnotatedBy Relationship
This relationship connects a feature to its annotations. A feature may have multiple annotations, but an annotation belongs to only one feature.
IsAnnotatedBy Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
to-link | string | id of the target Annotation. |
IsAnnotatedBy Indexes
Table | Name | Type | Fields | Notes |
IsAnnotatedBy | idxFrom | | from-link | |
IsAnnotatedBy | idxTo | unique | to-link | |
IsAttachmentSiteFor Relationship
This relationship connects an attachment site to the feature it attaches.
This is an old table that will probably be replaced by the regulon-related entities and relationships.
IsAttachmentSiteFor Fields
Name | Type | Notes |
edge | char | Edge of the attached feature relevant to the site: L for the left edge, R for the right. |
from-link | string | id of the source Feature. |
to-link | string | id of the target Feature. |
IsAttachmentSiteFor Indexes
Table | Name | Type | Fields | Notes |
IsAttachmentSiteFor | idxFrom | | from-link | |
IsAttachmentSiteFor | idxTo | | to-link | |
IsCategorizedInto Relationship
This relationship connects an EC number to reactions that are consistent with the chemistry described by the EC class.
IsCategorizedInto Fields
Name | Type | Notes |
from-link | string | id of the source Reaction. |
source | string | This is the source of the EC number mapping for this reaction. |
to-link | string | id of the target EcNumber. |
IsCategorizedInto Indexes
Table | Name | Type | Fields | Notes |
IsCategorizedInto | idxFrom | | from-link | |
IsCategorizedInto | idxTo | | to-link | |
IsClassFor Relationship
This relationship connects each subsystem class with the subsystems that belong to it. A class can contain many subsystems, but a subsystem is only in one class. Some subsystems are not in any class, but this is usually a temporary condition.
IsClassFor Fields
IsClassFor Indexes
Table | Name | Type | Fields | Notes |
IsClassFor | idxFrom | | from-link | |
IsClassFor | idxTo | unique | to-link | |
IsCollectionOf Relationship
A genome belongs to only one genome set. For each set, this relationship marks the genome to be used as its representative.
IsCollectionOf Fields
Name | Type | Notes |
from-link | string | id of the source GenomeSet. |
representative | boolean | TRUE for the representative genome of the set, else FALSE. |
to-link | string | id of the target Genome. |
IsCollectionOf Indexes
Table | Name | Type | Fields | Notes |
IsCollectionOf | idxFrom | | from-link | |
IsCollectionOf | idxTo | unique | to-link | |
IsConsistentWith Relationship
This relationship connects a functional role to the EC numbers consistent with the chemistry described in the role.
IsConsistentWith Fields
Name | Type | Notes |
from-link | string | id of the source EcNumber. |
to-link | string | id of the target Role. |
IsConsistentWith Indexes
Table | Name | Type | Fields | Notes |
IsConsistentWith | idxFrom | | from-link | |
IsConsistentWith | idxTo | | to-link | |
IsCoregulatedWith Relationship
This relationship connects a feature with another feature in the same genome with which it appears to be coregulated as a result of expression data analysis.
IsCoregulatedWith Fields
Name | Type | Notes |
coefficient | float | Pearson correlation coefficient for this coregulation. |
from-link | string | id of the source Feature. |
to-link | string | id of the target Feature. |
IsCoregulatedWith Indexes
Table | Name | Type | Fields | Notes |
IsCoregulatedWith | idxFrom | | from-link | |
IsCoregulatedWith | idxTo | | to-link | |
IsCoupledTo Relationship
This relationship connects two FIGfams that we believe to be related either because their members occur in proximity on chromosomes or because the members are expressed together. Such a relationship is evidence that the functions of the FIGfams are themselves related. This relationship is commutative; only the instance in which the first FIGfam has a lower ID than the second is stored.
IsCoupledTo Fields
Name | Type | Notes |
co-expression-evidence | counter | number of times members of the two FIGfams are co-expressed in expression data experiments |
co-occurrence-evidence | counter | number of times members of the two FIGfams occur close to each other on chromosomes |
from-link | string | id of the source Family. |
to-link | string | id of the target Family. |
IsCoupledTo Indexes
Table | Name | Type | Fields | Notes |
IsCoupledTo | idxFrom | | from-link | |
IsCoupledTo | idxTo | | to-link | |
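Because only one direction of each coupling is stored, a lookup has to put the two family IDs in order first. The sketch below assumes that "lower ID" means ordinary string comparison; the function names, family IDs, and evidence counts are hypothetical.

```python
from typing import Dict, Optional, Tuple


def coupling_key(family_a: str, family_b: str) -> Tuple[str, str]:
    # ASSUMPTION: "lower ID" is decided by plain string comparison.
    if family_a <= family_b:
        return (family_a, family_b)
    return (family_b, family_a)


def find_coupling(stored: Dict[Tuple[str, str], Dict],
                  family_a: str, family_b: str) -> Optional[Dict]:
    """Look up a coupling regardless of the order the IDs are given in."""
    return stored.get(coupling_key(family_a, family_b))


# Hypothetical stored instance, keyed by (from-link, to-link).
stored = {
    ("FIG000001", "FIG000007"): {
        "co-expression-evidence": 3,
        "co-occurrence-evidence": 12,
    },
}

forward = find_coupling(stored, "FIG000001", "FIG000007")
backward = find_coupling(stored, "FIG000007", "FIG000001")
```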
IsDeterminedBy Relationship
A functional coupling evidence set exists because it has pairings in it, and this relationship connects the evidence set to its constituent pairings. A pairing can belong to multiple evidence sets.
IsDeterminedBy Fields
Name | Type | Notes |
from-link | int | id of the source PairSet. |
inverted | boolean | A pairing is an unordered pair of protein sequences, but its similarity to other pairings in a pair set is ordered. Let (A,B) be a pairing and (X,Y) be another pairing in the same set. If this flag is FALSE, then (A =~ X) and (B =~ Y). If this flag is TRUE, then (A =~ Y) and (B =~ X). |
to-link | string | id of the target Pairing. |
IsDeterminedBy Indexes
Table | Name | Type | Fields | Notes |
IsDeterminedBy | idxFrom | | from-link | |
IsDeterminedBy | idxTo | unique | to-link | |
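One way to read the inverted flag is that it orients each pairing relative to its pair set: swapping the members of every inverted pairing lines up the corresponding sequences in the same position. The sketch below follows that reading, which is an interpretation rather than a documented procedure; the function name and the sample pairings are hypothetical.

```python
from typing import List, Tuple


def oriented(pairing: Tuple[str, str], inverted: bool) -> Tuple[str, str]:
    """Swap the members of a pairing when its inverted flag is TRUE."""
    a, b = pairing
    return (b, a) if inverted else (a, b)


# Hypothetical members of one pair set: (pairing, inverted flag).
members = [
    (("A", "B"), False),
    (("X", "Y"), True),
]

# After orientation, position 0 of every pairing corresponds, as does
# position 1. Here that yields A =~ Y and B =~ X.
aligned = [oriented(pairing, inverted) for pairing, inverted in members]
columns = list(zip(*aligned))
```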
IsExemplarOf Relationship
This relationship links a role to a feature that provides a typical example of how the role is implemented.
IsExemplarOf Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
to-link | string | id of the target Role. |
IsExemplarOf Indexes
Table | Name | Type | Fields | Notes |
IsExemplarOf | idxFrom | | from-link | |
IsExemplarOf | idxTo | | to-link | |
IsFamilyFor Relationship
This relationship connects an isofunctional family to the roles that make up its assigned function.
IsFamilyFor Fields
Name | Type | Notes |
from-link | string | id of the source Family. |
to-link | string | id of the target Role. |
IsFamilyFor Indexes
Table | Name | Type | Fields | Notes |
IsFamilyFor | idxFrom | | from-link | |
IsFamilyFor | idxTo | | to-link | |
IsFunctionalIn Relationship
This relationship connects a role with the features in which it plays a functional part.
In most cases, the functional assignment of a feature is also its role; however, some features implement multiple roles, and each individual role is connected by a separate instance of this relationship.
IsFunctionalIn Fields
Name | Type | Notes |
from-link | string | id of the source Role. |
to-link | string | id of the target Feature. |
IsFunctionalIn Indexes
Table | Name | Type | Fields | Notes |
IsFunctionalIn | idxFrom | | from-link | |
IsFunctionalIn | idxTo | | to-link | |
IsGroupFor Relationship
The recursive IsGroupFor relationship organizes taxonomic groupings into a hierarchy based on the standard organism taxonomy.
IsGroupFor Fields
IsGroupFor Indexes
Table | Name | Type | Fields | Notes |
IsGroupFor | idxFrom | | from-link | |
IsGroupFor | idxTo | unique | to-link | |
IsIdentifiedBy Relationship
The normal case is that an identifier identifies a single feature, while a feature can have many identifiers. Some identifiers, however, are ambiguous and will connect to many features.
IsIdentifiedBy Fields
Name | Type | Notes |
conf | char | Confidence code for this alias. "A" indicates it has been curated. Other codes indicate less confidence. |
from-link | string | id of the source Feature. |
to-link | string | id of the target Identifier. |
IsIdentifiedBy Indexes
Table | Name | Type | Fields | Notes |
IsIdentifiedBy | idxFrom | | from-link | |
IsIdentifiedBy | idxTo | | to-link | |
IsImplementedBy Relationship
This relationship connects a variant to the physical machines that implement it in the genomes. A variant is implemented by many machines, but a machine belongs to only one variant.
IsImplementedBy Fields
Name | Type | Notes |
from-link | hash-string | id of the source Variant. |
to-link | hash-string | id of the target MolecularMachine. |
IsImplementedBy Indexes
Table | Name | Type | Fields | Notes |
IsImplementedBy | idxFrom | | from-link | |
IsImplementedBy | idxTo | unique | to-link | |
IsInPair Relationship
A pairing contains exactly two protein sequences. A protein sequence can belong to multiple pairings. When going from a protein sequence to its pairings, they are presented in alphabetical order by sequence key.
IsInPair Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
to-link | string | id of the target Pairing. |
IsInPair Indexes
Table | Name | Type | Fields | Notes |
IsInPair | idxFrom | | from-link | |
IsInPair | idxTo | | to-link | |
IsLocatedIn Relationship
A feature is a set of DNA sequence fragments. Most features are a single contiguous fragment, so they are located in only one DNA sequence; however, fragments have a maximum length, so even a single contiguous feature may participate in this relationship multiple times. A few features belong to multiple DNA sequences. In that case, however, all the DNA sequences belong to the same genome. A DNA sequence itself will frequently have thousands of features connected to it.
IsLocatedIn Fields
Name | Type | Notes |
begin | int | Index (1-based) of the first residue in the contig that belongs to the segment. |
dir | char | Direction (strand) of the segment: "+" if it is forward and "-" if it is backward. |
from-link | string | id of the source Feature. |
len | int | Length of this segment. |
ordinal | int | Sequence number of this segment, starting from 1 and proceeding sequentially forward from there. |
to-link | string | id of the target Contig. |
IsLocatedIn Indexes
Table | Name | Type | Fields | Notes |
IsLocatedIn | idxFrom | | from-link, ordinal | This index allows the application to find all the segments of a feature in the proper order. |
IsLocatedIn | idxTo | | to-link, begin | This index is the one used by applications to find all the feature segments that contain a specific residue. |
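The two access patterns served by these indexes can be sketched as below. The sketch assumes that begin is the leftmost position of the segment on the contig regardless of strand, so a segment covers positions begin through begin + len - 1; that reading, the function names, and the sample rows are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def segment_span(begin: int, length: int) -> Tuple[int, int]:
    # ASSUMPTION: begin is the leftmost 1-based position on the contig.
    return begin, begin + length - 1


def feature_segments(rows: List[Dict], feature_id: str) -> List[Dict]:
    """All segments of a feature in ordinal order (like idxFrom)."""
    hits = [r for r in rows if r["from-link"] == feature_id]
    return sorted(hits, key=lambda r: r["ordinal"])


def segments_containing(rows: List[Dict], contig_id: str,
                        position: int) -> List[Dict]:
    """Segments on a contig that cover a given 1-based position."""
    found = []
    for r in rows:
        if r["to-link"] != contig_id:
            continue
        first, last = segment_span(r["begin"], r["len"])
        if first <= position <= last:
            found.append(r)
    return found


# Hypothetical sample rows.
rows = [
    {"from-link": "feat2", "to-link": "contigA", "begin": 500, "len": 100,
     "dir": "-", "ordinal": 1},
    {"from-link": "feat1", "to-link": "contigA", "begin": 301, "len": 50,
     "dir": "+", "ordinal": 2},
    {"from-link": "feat1", "to-link": "contigA", "begin": 101, "len": 200,
     "dir": "+", "ordinal": 1},
]

ordered_begins = [r["begin"] for r in feature_segments(rows, "feat1")]
at_300 = [(r["from-link"], r["ordinal"])
          for r in segments_containing(rows, "contigA", 300)]
at_550 = [r["from-link"] for r in segments_containing(rows, "contigA", 550)]
```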
IsMachineOf Relationship
This relationship connects a molecular machine to its various machine roles. Each machine has many machine roles, but each machine role belongs to only one machine.
IsMachineOf Fields
IsMachineOf Indexes
Table | Name | Type | Fields | Notes |
IsMachineOf | idxFrom | | from-link | |
IsMachineOf | idxTo | unique | to-link | |
IsMadeUpOf Relationship
This relationship connects each genome to the DNA sequences that make it up.
IsMadeUpOf Fields
Name | Type | Notes |
from-link | string | id of the source Genome. |
to-link | string | id of the target Contig. |
IsMadeUpOf Indexes
Table | Name | Type | Fields | Notes |
IsMadeUpOf | idxFrom | | from-link | |
IsMadeUpOf | idxTo | unique | to-link | |
IsModeledBy Relationship
A genome can be modeled by many different models, but a model belongs to only one genome.
IsModeledBy Fields
Name | Type | Notes |
from-link | string | id of the source Genome. |
to-link | string | id of the target Model. |
IsModeledBy Indexes
Table | Name | Type | Fields | Notes |
IsModeledBy | idxFrom | | from-link | |
IsModeledBy | idxTo | unique | to-link | |
IsNamedBy Relationship
The normal case is that an identifier names a single protein sequence, while a protein sequence can have many identifiers, but some identifiers name multiple sequences.
This relationship is populated with data submitted to the annotation clearinghouse as well as external IDs in the non-redundant database.
IsNamedBy Fields
IsNamedBy Indexes
Table | Name | Type | Fields | Notes |
IsNamedBy | idxFrom | | from-link | |
IsNamedBy | idxTo | | to-link | |
IsOwnerOf Relationship
This relationship connects each feature to its parent genome.
IsOwnerOf Fields
Name | Type | Notes |
from-link | string | id of the source Genome. |
to-link | string | id of the target Feature. |
IsOwnerOf Indexes
Table | Name | Type | Fields | Notes |
IsOwnerOf | idxFrom | | from-link | |
IsOwnerOf | idxTo | unique | to-link | |
IsProteinFor Relationship
This relationship connects a peg feature to the protein sequence it produces (if any). Only peg features participate in this relationship. A single protein sequence will frequently be produced by many features.
IsProteinFor Fields
IsProteinFor Indexes
Table | Name | Type | Fields | Notes |
IsProteinFor | idxFrom | | from-link | |
IsProteinFor | idxTo | unique | to-link | |
IsRegulatedWith Relationship
This relationship connects a feature to the set of coregulated features.
IsRegulatedWith Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
to-link | hash-string | id of the target CoregulatedSet. |
IsRegulatedWith Indexes
Table | Name | Type | Fields | Notes |
IsRegulatedWith | idxFrom | | from-link | |
IsRegulatedWith | idxTo | | to-link | |
IsRelevantFor Relationship
This relationship connects a diagram to the subsystems that are depicted on it. Only diagrams that are useful in curating or annotating the subsystem are specified in this relationship.
IsRelevantFor Fields
Name | Type | Notes |
from-link | string | id of the source Diagram. |
to-link | string | id of the target Subsystem. |
IsRelevantFor Indexes
Table | Name | Type | Fields | Notes |
IsRelevantFor | idxFrom | | from-link | |
IsRelevantFor | idxTo | | to-link | |
IsRequiredBy Relationship
This relationship specifies the reactions that occur in the model.
IsRequiredBy Fields
Name | Type | Notes |
from-link | string | id of the source Reaction. |
to-link | string | id of the target Model. |
IsRequiredBy Indexes
Table | Name | Type | Fields | Notes |
IsRequiredBy | idxFrom | | from-link | |
IsRequiredBy | idxTo | | to-link | |
IsRoleOf Relationship
This relationship connects a role to the machine roles that represent its appearance in a molecular machine. A machine role has exactly one associated role, but a role may be represented by many machine roles.
IsRoleOf Fields
Name | Type | Notes |
from-link | string | id of the source Role. |
to-link | string | id of the target MachineRole. |
IsRoleOf Indexes
Table | Name | Type | Fields | Notes |
IsRoleOf | idxFrom | | from-link | |
IsRoleOf | idxTo | unique | to-link | |
IsSetOf Relationship
A complex consists of many reactions and a reaction can be in multiple complexes.
IsSetOf Fields
Name | Type | Notes |
from-link | string | id of the source Complex. |
to-link | string | id of the target Reaction. |
IsSetOf Indexes
Table | Name | Type | Fields | Notes |
IsSetOf | idxFrom | | from-link | |
IsSetOf | idxTo | | to-link | |
IsSubInstanceOf Relationship
This relationship connects a scenario to the subsystem it validates. A scenario belongs to exactly one subsystem, but a subsystem may have multiple scenarios.
IsSubInstanceOf Fields
Name | Type | Notes |
from-link | string | id of the source Subsystem. |
to-link | int | id of the target Scenario. |
IsSubInstanceOf Indexes
Table | Name | Type | Fields | Notes |
IsSubInstanceOf | idxFrom | | from-link | |
IsSubInstanceOf | idxTo | unique | to-link | |
IsSuperclassOf Relationship
This is a recursive relationship that imposes a hierarchy on the subsystem classes.
IsSuperclassOf Fields
IsSuperclassOf Indexes
Table | Name | Type | Fields | Notes |
IsSuperclassOf | idxFrom | | from-link | |
IsSuperclassOf | idxTo | unique | to-link | |
IsTaxonomyOf Relationship
A genome is assigned to a particular point in the taxonomy tree, but not necessarily to a leaf node. In some cases, the exact species and strain are not available when inserting the genome, so it is placed at the lowest node that probably contains the actual genome.
IsTaxonomyOf Fields
IsTaxonomyOf Indexes
Table | Name | Type | Fields | Notes |
IsTaxonomyOf | idxFrom | | from-link | |
IsTaxonomyOf | idxTo | unique | to-link | |
IsTerminusFor Relationship
A terminus for a scenario is a compound that acts as its input or output. A compound can be the terminus for many scenarios, and a scenario will have many termini. The relationship attributes indicate whether the compound is an input to the scenario or an output. In some cases, there may be multiple alternative output groups. This is also indicated by the attributes.
IsTerminusFor Fields
Name | Type | Notes |
from-link | string | id of the source Compound. |
group-number | int | If zero, then the compound is an input. If one, the compound is an output. If two, the compound is an auxiliary output. |
to-link | int | id of the target Scenario. |
IsTerminusFor Indexes
Table | Name | Type | Fields | Notes |
IsTerminusFor | idxFrom | | from-link | |
IsTerminusFor | idxTo | | to-link, group-number | This index allows the application to view a scenario's compounds by group. |
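The group-number convention above can be sketched in Python. This is only an illustration: the tuple layout, the termini_by_group helper, and the sample ids are hypothetical stand-ins for IsTerminusFor records, not part of the Sapling API.

```python
from collections import defaultdict

# Labels for the group-number values documented in the Fields table.
GROUP_LABELS = {0: "input", 1: "output", 2: "auxiliary output"}


def termini_by_group(rows, scenario_id):
    """Group the compound ids of one scenario by terminus role.

    rows: iterable of (from_link, group_number, to_link) tuples, a
    hypothetical in-memory stand-in for IsTerminusFor records.
    """
    grouped = defaultdict(list)
    for compound_id, group_number, target in rows:
        if target != scenario_id:
            continue
        # Values outside the documented set are kept under a generic label.
        label = GROUP_LABELS.get(group_number, f"group {group_number}")
        grouped[label].append(compound_id)
    return dict(grouped)


# Made-up sample data: three termini for scenario 7, one for scenario 8.
rows = [
    ("cpdA", 0, 7),
    ("cpdB", 1, 7),
    ("cpdC", 2, 7),
    ("cpdD", 0, 8),
]
result = termini_by_group(rows, 7)
```

Filtering on the scenario first and then splitting by group number mirrors the field order of the idxTo index (to-link, group-number).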
IsTriggeredBy Relationship
A complex can be triggered by many roles. A role can trigger many complexes.
IsTriggeredBy Fields
Name | Type | Notes |
from-link | string | id of the source Complex. |
optional | boolean | TRUE if the role is not necessarily required to trigger the complex, else FALSE |
to-link | string | id of the target Role. |
IsTriggeredBy Indexes
Table | Name | Type | Fields | Notes |
IsTriggeredBy | idxFrom | | from-link | |
IsTriggeredBy | idxTo | | to-link | |
OccursIn Relationship
This relationship connects features to clusters. It is generally expected that the presence of features in a cluster indicates a functional relationship.
OccursIn Fields
Name | Type | Notes |
from-link | string | id of the source Feature. |
to-link | int | id of the target Cluster. |
OccursIn Indexes
Table | Name | Type | Fields | Notes |
OccursIn | idxFrom | | from-link | |
OccursIn | idxTo | | to-link | |
OperatesIn Relationship
This relationship connects an experiment to the media in which the experiment took place.
OperatesIn Fields
Name | Type | Notes |
from-link | string | id of the source Experiment. |
to-link | string | id of the target Media. |
OperatesIn Indexes
Table | Name | Type | Fields | Notes |
OperatesIn | idxFrom | | from-link | |
OperatesIn | idxTo | | to-link | |
Overlaps Relationship
A scenario overlaps a diagram when the diagram displays a portion of the reactions that make up the scenario. A scenario may overlap many diagrams, and a diagram may include portions of many scenarios.
Overlaps Fields
Name | Type | Notes |
from-link | int | id of the source Scenario. |
to-link | string | id of the target Diagram. |
Overlaps Indexes
Table | Name | Type | Fields | Notes |
Overlaps | idxFrom | | from-link | |
Overlaps | idxTo | | to-link | |
ProducedResultsFor Relationship
This relationship connects a chip to a genome for which it was used to produce experimental results. In general, a chip is used for only one genome and vice versa, but this is not a requirement.
ProducedResultsFor Fields
Name | Type | Notes |
from-link | string | id of the source Chip. |
to-link | string | id of the target Genome. |
ProducedResultsFor Indexes
Table | Name | Type | Fields | Notes |
ProducedResultsFor | idxFrom | | from-link | |
ProducedResultsFor | idxTo | | to-link | |
ProjectsOnto Relationship
This relationship connects two protein sequences for which a clear bidirectional best hit exists in known genomes. The attributes of the relationship describe how good the relationship is between the proteins. The relationship is bidirectional and symmetric, but is only stored in one direction (lower ID to higher ID).
ProjectsOnto Fields
Name | Type | Notes |
from-link | string | id of the source ProteinSequence. |
gene-context | counter | number of homologous genes in the immediate context of the two proteins, up to a maximum of 10 |
percent-identity | float | percent match between the two protein sequences |
score | float | score describing the strength of the projection, from 0 to 1, where 1 is the best |
to-link | string | id of the target ProteinSequence. |
ProjectsOnto Indexes
Table | Name | Type | Fields | Notes |
ProjectsOnto | idxFrom | | from-link | |
ProjectsOnto | idxTo | | to-link | |
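Because each projection is stored only once, from the lower ID to the higher ID, a lookup has to put the two ids in order before searching. The following Python sketch shows one way to do that. It assumes ids compare as plain strings, which the description does not state, and the dictionary, helper names, and sample ids are hypothetical.

```python
def canonical_pair(protein_a, protein_b):
    """Return (from_link, to_link) with the lower id first.

    Assumption: ids are ordered by plain string comparison; the source
    does not say how "lower" is defined.
    """
    if protein_a <= protein_b:
        return (protein_a, protein_b)
    return (protein_b, protein_a)


def find_projection(table, protein_a, protein_b):
    """Look up a projection regardless of argument order.

    table: hypothetical dict keyed by (from_link, to_link), standing in
    for the ProjectsOnto records.
    """
    return table.get(canonical_pair(protein_a, protein_b))


# Made-up sample record using the field names from the table above.
table = {
    ("aaa111", "bbb222"): {
        "gene-context": 4,
        "percent-identity": 87.5,
        "score": 0.92,
    }
}
forward = find_projection(table, "aaa111", "bbb222")
reverse = find_projection(table, "bbb222", "aaa111")
```

Both calls reach the same stored record, which is what lets the symmetric relationship be kept in a single direction.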
Shows Relationship
This relationship indicates that a compound appears on a particular diagram. The same compound can appear on many diagrams, and a diagram always contains many compounds.
Shows Fields
Name | Type | Notes |
from-link | string | id of the source Diagram. |
location | rectangle | Location of the compound's node on the diagram. |
to-link | string | id of the target Compound. |
Shows Indexes
Table | Name | Type | Fields | Notes |
Shows | idxFrom | | from-link | |
Shows | idxTo | | to-link | |
Uses Relationship
This relationship connects a genome to the machines that form its metabolic pathways. A genome can use many machines, but a machine is used by exactly one genome.
Uses Fields
Name | Type | Notes |
from-link | string | id of the source Genome. |
to-link | hash-string | id of the target MolecularMachine. |
Uses Indexes
Table | Name | Type | Fields | Notes |
Uses | idxFrom | | from-link | |
Uses | idxTo | unique | to-link | |
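The unique idxTo index is what keeps a machine tied to exactly one genome. As a rough illustration, the sketch below rebuilds the table in an in-memory SQLite database. SQLite is only a stand-in here; the TEXT column types and the genome and machine ids are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the real store.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Uses ("from-link" TEXT, "to-link" TEXT)')
conn.execute('CREATE INDEX idxFrom ON Uses ("from-link")')
conn.execute('CREATE UNIQUE INDEX idxTo ON Uses ("to-link")')

# One genome may use many machines.
conn.execute("INSERT INTO Uses VALUES (?, ?)", ("genome1", "machineA"))
conn.execute("INSERT INTO Uses VALUES (?, ?)", ("genome1", "machineB"))

# A second genome cannot claim a machine that is already in use,
# because the unique index rejects a repeated to-link value.
try:
    conn.execute("INSERT INTO Uses VALUES (?, ?)", ("genome2", "machineA"))
    second_genome_accepted = True
except sqlite3.IntegrityError:
    second_genome_accepted = False

machine_count = conn.execute(
    'SELECT COUNT(*) FROM Uses WHERE "from-link" = ?', ("genome1",)
).fetchone()[0]
```

The same pattern applies to the other relationships in this section whose idxTo index is marked unique.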
BelongsTo Shape
This relationship is not physically implemented in the database. It is implicit in the data for a variant. A variant contains a boolean expression that describes the various combinations of roles it can contain.
The Sapling database is a distributable, self-contained copy of the NMPDR data. Unlike Sprout, which is optimized for searching, Sapling is designed to be structurally simple without sacrificing the ability to find information quickly.
The diagram colors indicate how the data is loaded: each color marks a different load group.
- Red Genome group: includes all taxonomy and sequence data.
- Blue Subsystem group: includes the subsystems themselves and all the subsystem spreadsheets.
- Green Feature group: includes features, feature identifiers, protein sequences, and related publications.
- Yellow Scenario group: includes KEGG diagrams, scenarios, and connections to the related chemical reactions.
- Purple Alignment group: includes protein alignments and trees.
- Navy Model group: includes compounds, reactions, structural cues, and models for the cell chemistry.
- Brown Family group: includes FIGfams and functional coupling data.
- Black Protein group: includes annotation history and data from the Annotation Clearinghouse.
- Cyan Expression group: includes data relating to gene expression and cell regulation.