A new computational method that dramatically speeds up estimates of gene activity from RNA sequencing (RNA-seq) data has been developed by researchers at Carnegie Mellon University and the University of Maryland.
With the new method, dubbed Sailfish after the famously speedy fish, estimates of gene expression that previously took many hours can be completed in a few minutes, with accuracy that equals or exceeds previous methods. The researchers' report on their new method is being published online April 20 by the journal Nature Biotechnology.
Gigantic repositories of RNA-seq data now exist, making it possible to re-analyze experiments in light of new discoveries. "But 15 hours a pop really starts to add up, particularly if you want to look at 100 experiments," said Carl Kingsford, an associate professor in CMU's Lane Center for Computational Biology. "With Sailfish, we can give researchers everything they got from previous methods, but faster."
Though an organism's genetic makeup is static, the activity of individual genes varies greatly over time, making gene expression an important factor in understanding how organisms work and what occurs during disease processes. Gene activity can't be measured directly, but it can be inferred by monitoring RNA, the molecules that carry information from the genes to direct protein production and other cellular activities. RNA-seq is a leading method for producing these snapshots of gene expression; in genomic medicine, it has proven particularly useful in analyzing certain cancers.