The discovery of shared biological properties among independent variants of DNA sequences offers the opportunity to broaden understanding of the biological basis of disease and identify new therapeutic targets.
That is the conclusion of a collaboration among the Perelman School of Medicine at the University of Pennsylvania, the University of Arizona Health Sciences, and Vanderbilt University. The group published their findings this month in npj Genomic Medicine.
Drugs can have variable effects on people depending on small natural differences in the sequence of DNA between individuals. These genetic differences are called SNPs, or single nucleotide polymorphisms, and are variants in the DNA alphabet of A, T, C, and G molecules that occur naturally among individuals. Many such SNPs have been associated with disease risk, for instance showing that a person with an A at a given location in the DNA sequence has a higher risk of diabetes compared with someone with a G. However, these disease-related SNPs often reside in the so-called "dark matter" of the genome that does not directly code for genes, but does include switches that control gene expression.
Over the last ten years, researchers have conducted genome-wide association studies (GWAS) to map DNA variants across thousands of individual genomes and find which variants are more frequent in people with a certain disease. For such common, complex diseases as diabetes or cancer, GWAS have identified hundreds of such variants. However, GWAS have also found that many disease-associated variants do not alter the function of genes in an obvious way, making some variants difficult to interpret clinically.
Senior author Jason H. Moore, PhD, the Edward Rose Professor of Informatics and director of the Institute for Biomedical Informatics, and colleagues Yves A. Lussier, Haiquan Li, Ikbel Achour, and Joshua C. Denny have developed a computational method to explore the downstream effects of risk-associated variants and reveal possible mechanisms of disease. "Our results provide a 'roadmap' of disease mechanisms emerging from GWAS to identify candidate therapeutic targets," Moore said.
In the current paper, the team demonstrated that variants associated with disease risk can affect such biological activities as gene expression and the function of proteins in cellular house-keeping machinery. "Taking this all together, a more comprehensive picture of disease biology is emerging," Moore said. "This picture - up to now - has been blurry, especially when variants occurred between genes."
The team used computational modelling of two million pairs of disease-associated SNPs drawn from three GWAS projects, as well as information from other genome databases that match a patient's individual genetic makeup to their outward symptoms. From this, they predicted 3,870 SNP pairs with a similar biological mechanism. These prioritized SNP pairs, with overlapping messenger RNA targets or similar functions, were more likely to be associated with the same disease than with unrelated pathologies.
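To make the idea of "overlapping messenger RNA targets" concrete, here is a minimal sketch of how SNP pairs could be ranked by how many target genes they share. It is an illustration only, not the team's published method: the Jaccard score, the 0.5 threshold, the function names, and the SNP and gene identifiers are all assumptions made for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of targets shared by two sets (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prioritize_pairs(targets, threshold=0.5):
    """Return SNP pairs whose target sets overlap at or above the threshold.

    `targets` maps a SNP identifier to the set of genes whose mRNA it is
    thought to affect. The threshold is an arbitrary illustrative cutoff.
    """
    pairs = []
    for snp_a, snp_b in combinations(sorted(targets), 2):
        score = jaccard(targets[snp_a], targets[snp_b])
        if score >= threshold:
            pairs.append((snp_a, snp_b, score))
    # Highest-overlap pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical SNP identifiers and gene names, for illustration only.
targets = {
    "snp_1": {"GENE_A", "GENE_B", "GENE_C"},
    "snp_2": {"GENE_B", "GENE_C"},
    "snp_3": {"GENE_X"},
}

for snp_a, snp_b, score in prioritize_pairs(targets):
    print(f"{snp_a} / {snp_b}: overlap {score:.2f}")
```

In this toy data only the first two SNPs share targets, so only that pair is kept. A real analysis would also need a statistical test to show that the overlap is larger than expected by chance.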
Specifically, using a subset of the prioritized SNP pairs in independent studies of Alzheimer's disease, bladder cancer, and rheumatoid arthritis patient data, they showed that two variants can contribute to disease independently, but also interact genetically. "From this we determined that the precise combination of DNA variants in a patient may synergistically increase or antagonistically decrease one's relative risk of disease," Moore said.
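One common way to express such an interaction, sketched here under assumptions the article does not confirm, is to compare the relative risk observed in people carrying both variants with the risk expected if the two variants acted independently. The example assumes a multiplicative model for independence; the function names, the 5% tolerance, and the risk values are hypothetical.

```python
def interaction_ratio(rr_a, rr_b, rr_both):
    """Observed joint relative risk divided by the risk expected if the
    two variants acted independently (multiplicative model, assumed)."""
    expected = rr_a * rr_b
    return rr_both / expected


def classify(ratio, tolerance=0.05):
    """Label the interaction; the tolerance band is an arbitrary choice."""
    if ratio > 1 + tolerance:
        return "synergistic"
    if ratio < 1 - tolerance:
        return "antagonistic"
    return "independent"


# Hypothetical relative risks for carriers of variant A, variant B, and both.
rr_a, rr_b = 1.2, 1.5
for rr_both in (2.7, 1.8, 1.26):
    ratio = interaction_ratio(rr_a, rr_b, rr_both)
    print(f"joint RR {rr_both}: ratio {ratio:.2f} -> {classify(ratio)}")
```

With these made-up numbers, a joint relative risk of 2.7 is higher than the expected 1.8 and is labeled synergistic, while 1.26 is lower and is labeled antagonistic. Real studies estimate these quantities with confidence intervals rather than a fixed tolerance.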
Using data sets from the Encyclopedia of DNA Elements (ENCODE) - a project designed to discover the bits of DNA that have a biological function related to a gene - the team validated that the biological mechanisms of disease shared within the prioritized SNP pairs are frequently governed by matching transcription factor binding sites and interactions of segments of chromatin that are seemingly far apart and not related.
Chromatin is the combination of packaged DNA, protein, and RNA found in the nucleus. It functions to compact DNA, facilitate cell replication, prevent DNA damage, and control gene expression. Now, the team is refining their methods to identify genetic risk factors that have been overlooked because the biological context of their effects on common diseases was ignored.