Researchers have created the first global family tree of metabolic protein architecture and their advance offers a new window on the evolutionary history of metabolism.
The work, by researchers at the University of Illinois, appears this week in the online edition of the Proceedings of the National Academy of Sciences.
Their work relies on established techniques of phylogenetic analysis, developed in the past decade to trace the evolution of genes and organisms, which have never before been used to work out the evolutionary history of protein architecture across biological networks.
"We are interested in how structure evolves, not how organisms evolve," said professor of crop sciences Gustavo Caetano-Anollés, principal researcher on the study, which was co-written by graduate student Hee Shin Kim and emeritus professor of cell and developmental biology Jay E. Mittenthal. "We are using the techniques of phylogenetic analysis that systematists used to build the tree of life, and we are applying it to a biochemical problem, a systems biology problem."
To get at the roots of protein evolution, the researchers examined metabolic proteins at the level of their component structures: easily identifiable folds in the proteins that have known enzymatic activities.
These protein domains catalyze a range of functions, breaking down or combining metabolites, small molecules that include the building blocks of all life.
Their findings rested on a basic assumption: that the most widely used protein folds were also the most ancient. The researchers looked at proteins in more than 200 species.
"Protein architecture has preserved ancient structural designs as fossils of ancient biochemistries," the authors wrote.
The team used data from two international collections of genetic and proteomic information: the metabolic pathways database of the Kyoto Encyclopedia of Genes and Genomes, and the Structural Classification of Proteins database.
They combined these two data sets with phylogenetic reconstructions, or family trees, of protein fold architectures in metabolism. They created a new database, called the Molecular Ancestry Network (MANET), which connects these data sources into a new global network diagram of metabolic pathways.
The researchers added color, representing evolutionary age, to their diagrams of metabolic networks (for an example, see the purine metabolism network in MANET). The result is a multicolored mosaic of protein fold evolution. The mosaic shows that modern metabolic networks, and even individual enzymes, are composed of both very ancient and much more recent protein architectures.
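The idea of painting a metabolic network by fold age can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the MANET implementation: the pathway, enzyme and fold names, the age values and the red-to-blue color scale are all invented for illustration, and the three dictionaries merely stand in for the roles played by the pathway database, the structural classification and the phylogenetic tree.

```python
# Toy illustration only: every identifier and age value below is an invented
# placeholder, not a record from KEGG, SCOP, or MANET.

# Pathway -> enzymes (the role played by a metabolic pathway database).
pathway_enzymes = {
    "toy_pathway": ["enzyme_A", "enzyme_B", "enzyme_C"],
}

# Enzyme -> fold architectures of its domains (the role of a structural classification).
enzyme_folds = {
    "enzyme_A": ["fold_1"],
    "enzyme_B": ["fold_1", "fold_3"],  # an old and a new domain in one protein
    "enzyme_C": ["fold_2"],
}

# Fold -> relative age read off a family tree of folds (assumed scale):
# 0.0 = most ancient, 1.0 = most recent.
fold_age = {"fold_1": 0.0, "fold_2": 0.5, "fold_3": 1.0}


def age_to_color(age: float) -> str:
    """Map a relative age in [0, 1] to a hex color, red (ancient) to blue (recent)."""
    if not 0.0 <= age <= 1.0:
        raise ValueError("age must be between 0 and 1")
    red = round(255 * (1.0 - age))
    blue = round(255 * age)
    return f"#{red:02x}00{blue:02x}"


def color_pathway(pathway: str) -> dict[str, list[str]]:
    """Return, for each enzyme in the pathway, the colors of its domain folds."""
    return {
        enzyme: [age_to_color(fold_age[fold]) for fold in enzyme_folds[enzyme]]
        for enzyme in pathway_enzymes[pathway]
    }


def is_mosaic(colors: list[str]) -> bool:
    """True if an enzyme's domains carry more than one age color."""
    return len(set(colors)) > 1


if __name__ == "__main__":
    for enzyme, colors in color_pathway("toy_pathway").items():
        print(enzyme, colors, "mosaic" if is_mosaic(colors) else "single age")
```

In this toy data, enzyme_B carries both the oldest and the youngest fold, the kind of within-protein mosaic the researchers describe.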
"This mosaic is telling you that the new enzymes and old enzymes are together performing side by side," Caetano-Anollés said. "In some cases in the same protein you have old domains and new domains working together."
This finding supports the hypothesis that protein architectures that perform one function are often recruited to perform new tasks.
The new, global family tree of protein architecture also revealed that many metabolic protein folds are quite ancient: these architectures were found to be common in all the species of bacteria, animals, plants, fungi, protists and archaea the researchers analyzed.
Of 776 metabolic protein folds surveyed, 16 were found to be universal, and nine of those occurred in the earliest branches of the newly constructed tree.
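A count of this kind can be sketched in a few lines of Python. The sketch assumes that "universal" means present in every species surveyed, which the article does not spell out, and the species and fold names are invented placeholders rather than data from the study.

```python
from collections import Counter

# Toy illustration only: species and fold names are invented placeholders.
fold_presence = {
    "species_bacterium": {"fold_1", "fold_2", "fold_3"},
    "species_archaeon": {"fold_1", "fold_2"},
    "species_eukaryote": {"fold_1", "fold_2", "fold_4"},
}


def universal_folds(presence: dict[str, set[str]]) -> set[str]:
    """Folds present in every species surveyed (empty set if no species given)."""
    fold_sets = list(presence.values())
    if not fold_sets:
        return set()
    return set.intersection(*fold_sets)


def fold_usage(presence: dict[str, set[str]]) -> Counter:
    """Count how many species carry each fold."""
    return Counter(fold for folds in presence.values() for fold in folds)


if __name__ == "__main__":
    print(sorted(universal_folds(fold_presence)))
    print(fold_usage(fold_presence).most_common())
```

The usage count also reflects the study's working assumption that the folds found in the most species are the oldest, although the researchers placed folds on their tree with full phylogenetic methods rather than a simple tally.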
"These nine ancient folds represent architectures of fundamental importance undisputedly encoded in a genetic core that can be traced back to the universal ancestor of the three super kingdoms of life," the authors wrote.
The analysis also found that the most ancient metabolic protein folds are vital to RNA metabolism, particularly the interconversion of the purine and pyrimidine nucleotides that form the core of the RNA molecule.
This discovery supports the hypothesis of an RNA world in which RNA molecules were among the earliest catalysts of life. This idea is based in part on the observation that RNA still retains many of its catalytic capabilities, including the ability to make proteins. Step by step, according to this theory, proteins began taking over some of the original functions of RNA.
"The most ancient (protein) molecules were involved in the interconversion of nucleotides. But they were not synthesizing them," Caetano-Anollés said. "We see that all the enzymes that were involved in purine synthesis, for example, were very recent. Since these first proteins benefited the formation of building blocks for the primitive RNA world, it makes a lot of sense that we've found this origin encased in nucleotide metabolism."