23andMe, the leading personal genetics company, said on Monday that it has published a study introducing HaploScore, a new open-source algorithm that can improve the accuracy and efficiency of identity-by-descent (IBD) detection.
The study, titled "Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis," was published on April 30, 2014, in Molecular Biology and Evolution.
HaploScore provides a metric for ranking how likely it is that a stretch of DNA shared by two individuals was inherited IBD. Analysis of genomic segments shared IBD between individuals is fundamental to many genetic applications, from demographic inference to estimating the heritability of diseases and identifying distant relatives, but the accuracy of IBD detection in non-simulated data was previously largely unknown.
To determine the accuracy of existing IBD detection algorithms, researchers selected 25,432 genotyped European individuals, including 2,952 father-mother-child trios, from the 23andMe, Inc. dataset. The team then used GERMLINE, a widely used IBD detection method, to detect IBD segments within this cohort and found a false positive rate of more than 67 percent for short (2 to 4 centiMorgan) segments. These false positives arise primarily from the allowance for DNA phasing errors during IBD detection, an allowance that is necessary for retrieving long (> 6 centiMorgan) segments. The team then replicated the false positive findings in an external dataset and introduced the HaploScore algorithm to improve the accuracy of short IBD segments while retaining long segments.
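The announcement does not describe how the trios were used to measure false positives. As a rough illustration only, the sketch below assumes one plausible check: a segment that a child shares with an unrelated individual is counted as supported only if at least one parent shares an overlapping segment with that same individual. The Segment type, the function names, and the 80 percent coverage threshold are hypothetical and are not taken from the study or from GERMLINE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A reported IBD segment between two people (hypothetical layout)."""
    person: str
    other: str
    chrom: int
    start_cm: float
    end_cm: float


def covered_fraction(seg, candidate):
    """Fraction of seg's genetic length that candidate overlaps."""
    if seg.chrom != candidate.chrom:
        return 0.0
    length = seg.end_cm - seg.start_cm
    if length <= 0:
        return 0.0
    overlap = min(seg.end_cm, candidate.end_cm) - max(seg.start_cm, candidate.start_cm)
    return max(0.0, overlap) / length


def false_positive_rate(segments, trios, min_cover=0.8):
    """Share of child segments that neither parent supports.

    trios maps a child ID to a (father ID, mother ID) tuple.
    min_cover is an assumed threshold, not a value from the study.
    """
    by_pair = {}
    for seg in segments:
        by_pair.setdefault((seg.person, seg.other), []).append(seg)

    total = 0
    unsupported = 0
    for seg in segments:
        parents = trios.get(seg.person)
        if parents is None or seg.other in parents:
            continue  # only score child segments shared with non-parents
        total += 1
        parent_segs = [p for parent in parents
                       for p in by_pair.get((parent, seg.other), [])]
        if not any(covered_fraction(seg, p) >= min_cover for p in parent_segs):
            unsupported += 1
    return unsupported / total if total else 0.0
```

For simplicity, the sketch assumes each segment is listed with the trio member as person and the other individual as other, and it ignores phasing altogether.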
Because the open-source HaploScore algorithm can be applied to existing IBD segments, it can distinguish true from false IBD segments reported by any detection method, improving accuracy. The use of IBD segments in genetic analyses will become increasingly common as the number of individuals whose genetic composition is known increases.
"Identifying these false positives and creating the HaploScore solution will allow us to improve IBD detection and DNA phasing," said Cory McLean, Ph.D., study author and 23andMe computational biologist. "Improved IBD detection and DNA phasing will allow all researchers to more accurately identify genetic relationships between distantly-related individuals and allow for improved ancestry reports within 23andMe."