The Initial Attempt to Produce a Metabolic Reconstruction

A metabolic reconstruction is an attempt to infer the metabolic machinery of an organism from its sequenced genome and the available literature. The term was introduced by Evgeni Selkov in his early work on the first sequenced genomes. Selkov made available his substantial collection of encoded metabolic pathways, and those, along with existing encodings (most notably the wonderful pathway charts created by Gerhard Michal and distributed by Boehringer Mannheim), launched numerous efforts to encode the metabolism of sequenced organisms. The major effort by KEGG has become perhaps the best known, and it is the one the SEED effort has tended to use.

Different groups have created slightly differing notions of what is meant by metabolic reconstruction. Within the context of this course, we draw the following distinctions:

  1. By an informal metabolic reconstruction we refer to an account of which subsystems (and which functional roles) are present in the organism. With informal metabolic reconstructions, it is common to include not only metabolic subsystems (i.e., pathways) but nonmetabolic subsystems as well.
  2. By a formal metabolic reconstruction we refer to a detailed encoding of the metabolic reaction network of the organism.
That is, the informal reconstruction attempts to represent as much of the cellular machinery as possible. It provides a solid foundation on which the formal metabolic reconstruction can be based. However, the informal metabolic reconstruction has substantial value by itself. Many aspects of the phenotype can be inferred by purely qualitative reasoning based on the presence or absence of specific subsystems or functional roles. Further, many aspects of the biochemistry (e.g., "missing genes") can be analyzed from the perspective of the informal metabolic reconstruction alone.
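The qualitative reasoning described above can be sketched in a few lines of Python. This is only an illustration: the subsystem names, functional roles, and the genome's role set are hypothetical placeholders, not data taken from the SEED.

```python
# Hypothetical subsystem definitions: subsystem name -> required functional roles.
SUBSYSTEMS = {
    "subsystem_A": {"role_1", "role_2", "role_3"},
    "subsystem_B": {"role_4", "role_5"},
    "subsystem_C": {"role_6", "role_7"},
}

def classify_subsystems(subsystems, roles_in_genome):
    """Return {subsystem: (status, missing_roles)} based on role presence."""
    report = {}
    for name, required in subsystems.items():
        missing = required - roles_in_genome
        if not missing:
            status = "present"
        elif missing == required:
            status = "absent"
        else:
            status = "partial"  # candidate "missing gene" situation
        report[name] = (status, sorted(missing))
    return report

# Hypothetical set of functional roles assigned to genes in the genome.
genome_roles = {"role_1", "role_2", "role_3", "role_4"}
report = classify_subsystems(SUBSYSTEMS, genome_roles)
for name, (status, missing) in sorted(report.items()):
    print(name, status, missing)
```

A subsystem reported as "partial" is where a missing-gene question would naturally be recorded.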

The formal reconstruction is usually limited to just metabolic reactions (and those reactions involving generation or degradation of polymers are normally left out). The output of a formal metabolic reconstruction will include detailed encodings of both the reactions and the compounds that appear in the metabolic network.

These distinctions are ours, and are not commonly used. We consider them unimportant, but useful.

In this section of the course, we are asking the student to build both an informal and a formal metabolic reconstruction for some sequenced organism. Clearly this is an ambitious task. It would have been largely impossible to do anything significant 10 years ago, but with the new tools we believe that this effort can be quite productive as an amazing crash course in biochemistry and microbial physiology.

Rather than break this part of the course up into weekly assignments (at least for now), we list the detailed steps we would like the student to work through.

We suggest that each student be assigned a distinct organism (alternatively, groups of students can work jointly on a single organism). We suggest choosing an organism that fulfills the following criteria:

Steps in the Process of Developing an Informal Metabolic Reconstruction

Getting summaries of what is in the genome

First, you should get two estimates of what cellular machinery is present in the organism:
  1. You should get a list of the subsystems with operational variants from a SEED installation. The easiest way to do this involves starting from the first page of the SEED, asking for Statistics for the genome you are working on, and then (near the bottom of the page) clicking on Show subsystems. Note that the subsystems and genes that you get back may include both well-curated subsystems and poorly-constructed subsystems.
  2. You should get colored versions of the KEGG maps (showing which functions are believed to be present in the genome).

Begin with the Common Machinery

There is a subset of the cellular machinery that will be present in some form in whichever genome you picked. The ribosomal RNA, ribosomal proteins, tRNAs, tRNA synthetases, and so forth must all be there. Look through the set of subsystems that are present, decide what aspects appear to be essential machinery relating to transcription and translation, and begin with that. Create a detailed summary of which topics you have selected, which variants exist, and which genes implement those variants. Which rRNAs and tRNAs exist? How many copies of the rRNA cluster exist?

Studying Amino Acid Synthesis

Next, we suggest turning to amino acid metabolism or, more narrowly, the synthesis of amino acids. Identify which of the KEGG maps address this section of metabolism, and then which subsystems from the SEED are relevant. Now prepare a list of the amino acids that can be synthesized, along with the starting point in each case. Make sure that you compose a detailed list of outstanding questions.
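One way to organize the list of synthesizable amino acids is a simple reachability check: start from a set of available compounds and repeatedly apply any reaction whose substrates are all available. The sketch below assumes a toy, simplified reaction list (cofactors and by-products omitted) chosen only for illustration; the function and variable names are hypothetical.

```python
def reachable_compounds(reactions, seeds):
    """Expand the set of producible compounds until no reaction adds anything."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

# Toy reaction list: (substrates, products). Simplified for illustration.
reactions = [
    (("chorismate",), ("prephenate",)),
    (("prephenate",), ("phenylpyruvate",)),
    (("phenylpyruvate", "L-glutamate"), ("L-phenylalanine", "2-oxoglutarate")),
]
seeds = {"chorismate", "L-glutamate"}  # assumed starting points
targets = ["L-phenylalanine", "L-tryptophan"]

reachable = reachable_compounds(reactions, seeds)
synthesizable = [t for t in targets if t in reachable]
open_questions = [t for t in targets if t not in reachable]
print("synthesizable:", synthesizable)
print("outstanding questions:", open_questions)
```

Targets that are not reached belong on the list of outstanding questions: either the pathway is absent, or the relevant genes have not yet been identified.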

Synthesis of Nucleotides

We suggest that you next turn your attention to synthesis of nucleotides. Locate the appropriate KEGG charts and the relevant subsystems. Again, summarize the situation, along with outstanding questions.

Systematically Work Through the Central Cellular Machinery

Between the SEED hierarchy, the KEGG maps, and the numerous examples of metabolic reconstructions published in genome papers, you have numerous examples of the basic components of a functional hierarchy. You should choose a reasonable organizational style and produce an HTML document comprising your best effort at an informal metabolic reconstruction.
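If you keep your notes in a structured form, the HTML document can be generated rather than written by hand. The following is a minimal sketch under that assumption; the category names, subsystem names, and notes are placeholders, and `render_reconstruction` is a hypothetical helper, not part of the SEED.

```python
from html import escape

def render_reconstruction(title, hierarchy):
    """Render {category: [(subsystem, note), ...]} as a simple HTML fragment."""
    parts = [f"<h1>{escape(title)}</h1>"]
    for category, subsystems in hierarchy.items():
        parts.append(f"<h2>{escape(category)}</h2>")
        parts.append("<ul>")
        for name, note in subsystems:
            parts.append(f"<li><b>{escape(name)}</b>: {escape(note)}</li>")
        parts.append("</ul>")
    return "\n".join(parts)

# Placeholder content; replace with your own hierarchy and findings.
hierarchy = {
    "Translation": [("subsystem_A", "all roles present")],
    "Amino acid synthesis": [("subsystem_B", "one role missing <open question>")],
}
page = render_reconstruction("Informal metabolic reconstruction", hierarchy)
print(page)
```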

The Basic Steps in Building a Formal Metabolic Reconstruction

You should begin by studying exactly how Bernhard Palsson and his team have built formal metabolic reconstructions. You are being asked to construct a list of several hundred reactions, where each reaction includes precise substrates, products, and (possibly) a required enzyme.
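To make "precise substrates, products, and (possibly) a required enzyme" concrete, here is one possible encoding of a reaction, together with an element-balance check that can catch improperly encoded reactions. This is a sketch of our own, not the format used by Palsson's group or by the SEED; the class, the function names, and the small formula table are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed compound table: compound name -> elemental composition (neutral form).
FORMULAS = {
    "glucose 6-phosphate": {"C": 6, "H": 13, "O": 9, "P": 1},
    "fructose 6-phosphate": {"C": 6, "H": 13, "O": 9, "P": 1},
    "H2O": {"H": 2, "O": 1},
}

@dataclass
class Reaction:
    name: str
    substrates: Dict[str, int]    # compound -> stoichiometric coefficient
    products: Dict[str, int]
    enzyme: Optional[str] = None  # e.g. an EC number; None if no enzyme is required

def element_totals(side, formulas):
    """Sum the atoms of each element on one side of a reaction."""
    totals = Counter()
    for compound, coeff in side.items():
        for element, count in formulas[compound].items():
            totals[element] += coeff * count
    return totals

def is_balanced(reaction, formulas):
    """True if both sides of the reaction contain the same atoms."""
    left = element_totals(reaction.substrates, formulas)
    right = element_totals(reaction.products, formulas)
    return left == right

pgi = Reaction("glucose-6-phosphate isomerase",
               {"glucose 6-phosphate": 1}, {"fructose 6-phosphate": 1},
               enzyme="5.3.1.9")
# Deliberately mis-encoded version with a spurious water on the product side.
bad = Reaction("mis-encoded isomerase",
               {"glucose 6-phosphate": 1}, {"fructose 6-phosphate": 1, "H2O": 1})

print(pgi.name, is_balanced(pgi, FORMULAS))
print(bad.name, is_balanced(bad, FORMULAS))
```

An element-balance check does not prove a reaction is correct, but it is a cheap first filter when working through several hundred entries.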

Begin from the Informal Metabolic Reconstruction

You should begin from the informal metabolic reconstruction and accumulate the reactions and compounds implied by the operational variants of the subsystems. This can be done by starting from the first page of the SEED, asking for Statistics for the genome you are working on, and then (near the bottom of the page) clicking on Show reactions. This tool produces an initial estimate of the reaction set.
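Conceptually, accumulating the reactions and compounds amounts to taking a union over the operational variants. The sketch below illustrates that idea only; it does not reproduce what the SEED's Show reactions tool does internally, and all identifiers (`VARIANT_REACTIONS`, `REACTION_COMPOUNDS`, the `rxn_*` and `cpd_*` ids) are hypothetical.

```python
# Hypothetical data: reactions implied by each (subsystem, variant code) pair.
VARIANT_REACTIONS = {
    ("subsystem_A", "1"): ["rxn_1", "rxn_2"],
    ("subsystem_A", "2"): ["rxn_1", "rxn_4"],
    ("subsystem_B", "1"): ["rxn_2", "rxn_3"],
}
# Hypothetical data: compounds taking part in each reaction.
REACTION_COMPOUNDS = {
    "rxn_1": {"cpd_a", "cpd_b"},
    "rxn_2": {"cpd_b", "cpd_c"},
    "rxn_3": {"cpd_c", "cpd_d"},
    "rxn_4": {"cpd_a", "cpd_e"},
}

def collect_network(operational_variants, variant_reactions, reaction_compounds):
    """Union the reactions and compounds implied by the operational variants."""
    reactions, compounds = set(), set()
    for key in operational_variants:
        for rid in variant_reactions.get(key, []):
            reactions.add(rid)
            compounds |= reaction_compounds[rid]
    return reactions, compounds

# Variants judged operational in the genome (assumed for illustration).
operational = [("subsystem_A", "1"), ("subsystem_B", "1")]
reactions, compounds = collect_network(operational, VARIANT_REACTIONS, REACTION_COMPOUNDS)
print(sorted(reactions))
print(sorted(compounds))
```

Because sets are used, a reaction shared by several subsystems (here `rxn_2`) is counted once.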

This initial set is far from complete, and some of the reactions presented will be encoded improperly. Before continuing, let us ponder what a "complete and accurate formal metabolic reconstruction" would contain:

For the purposes of this class, construction of this initial, crude formal metabolic reconstruction is both the best you can do and a major achievement. To refine it into a useful and accurate summary of the metabolism of the cell is something that a person might work a lifetime on.


The object of this portion of the class is to develop both an initial informal metabolic reconstruction and a formal metabolic reconstruction for some specific organism. If you can successfully achieve this, you will have done something that was almost impossible even a few years ago. If you study and reflect on what you accomplish, it will form a starting point for deepening your understanding of microbial physiology and biochemistry.