The Encyclopedia of DNA Elements, or ENCODE, reveals the complexities of DNA and genes and opens the door to revolutionary treatments for a host of deadly genetic diseases, researchers say.
"This is a major step toward understanding the wiring diagram of a human being," said lead researcher Michael Snyder of Stanford University.
ENCODE has enabled scientists to assign specific biological functions to 80 percent of the human genome and has helped explain how genetic variants affect a person's susceptibility to disease.
It also exposed previously hidden connections between seemingly unrelated diseases such as asthma, lupus and multiple sclerosis, which were found to be linked to specific genetic regulatory codes for proteins that regulate the immune system.
A key insight, revealed in a host of papers published in the journals Nature, Science and Cell, is that many diseases result from changes in when, where and how a gene switches on or off rather than from a change to the gene itself.
"Genes occupy only a tiny fraction of the genome, and most efforts to map the genetic causes of disease were frustrated by signals that pointed away from genes," said co-author John Stamatoyannopoulos, a researcher at the University of Washington.
"Now we know that these efforts were not in vain, and that the signals were in fact pointing to the genome's 'operating system.'"
Another significant finding is that this blueprint of genetic switches can be used to pinpoint cell types that play a role in specific diseases without needing to understand how the disease actually works.
For instance, it took researchers decades to link a set of immune cells with Crohn's disease, an inflammatory bowel disease. The ENCODE data swiftly showed that the genetic variants associated with Crohn's were concentrated in that set of cells.
This in-depth map of the human genetic code has also altered scientific understanding of how DNA works.
The first sketch of the human genome described DNA as a string that contained genes in isolated sections making up just two percent of its length.
The space in between was dubbed "junk DNA" and many researchers did not believe it served a function. Attention was focused on the 'coding' genes, which carry instructions for making the proteins that carry out basic biological functions.
ENCODE confirmed more recent theories that the bulk of this 'junk' is actually littered with switches that determine how the genes work and act as a massive control panel.
"Our genome is simply alive with switches: millions of places that determine whether a gene is switched on or off," said lead analysis coordinator Ewan Birney of the EMBL-European Bioinformatics Institute.
"We found a much bigger part of the genome -- a surprising amount, in fact -- is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks."
Perhaps most importantly, the database has been made available to the scientific community -- and the general public -- as an open resource in order to facilitate research.
"ENCODE gives us a set of very valuable leads to follow to discover key mechanisms at play in health and disease," said Ian Dunham of the EMBL-European Bioinformatics Institute, who played a key role in coordinating the analysis.
"Those can be exploited to create entirely new medicines, or to repurpose existing treatments."
The project combined the efforts of 442 scientists in 32 labs in the United States, Britain, Spain, Switzerland, Singapore and Japan.
The researchers used about 300 years' worth of computer time to study 147 tissue types and identified more than four million different regulatory regions where proteins interact with the DNA.