To make public gene expression data more accessible to biomedical researchers without computational expertise, scientists from the National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health, have developed a free online platform that uses a crowdsourcing approach.
They describe the platform, called OMics Compendia Commons (OMiCC), in the June 20 online issue of Nature Biotechnology.
Public databases contain millions of gene expression profiles, data that describe the degree to which genes are turned on or off under certain conditions. Potentially, scientists could reuse these data to generate and address new research questions. For example, researchers could repurpose a dataset comparing blood samples from drug-treated and untreated people to investigate the effects of gender on treatment. However, this wealth of information remains largely untapped for such reuse, partly because many biologists lack the computer programming expertise needed for data retrieval, processing and analysis. In addition, public database entries typically contain raw study data, which must be structured before they can be analyzed.
OMiCC aims to use crowdsourcing techniques to harness the expertise of the research community to overcome these challenges. Within the platform, users can create groups of gene expression data and "annotate" them by assigning parameters, such as sample type and disease, using a standardized vocabulary. OMiCC saves these user-created groups and associated annotations, making them available to others for reuse.
Within OMiCC, users can pool these groups of data and analyze information from multiple studies together, a statistical approach known as meta-analysis, to search for biological relationships. The NIAID scientists anticipate that as the OMiCC user community grows, the platform will develop into a rich resource that can transform the increasing amounts of public data into novel biological insights.
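For readers curious about what pooling results across studies can look like in practice, here is a minimal Python sketch of one common form of meta-analysis, a fixed-effect model with inverse-variance weights. It is an illustration only and is not drawn from OMiCC itself: the function name `fixed_effect_meta` and the example numbers are hypothetical, and the platform may use a different statistical method.

```python
import math


def fixed_effect_meta(effects, variances):
    """Pool per-study effect sizes using inverse-variance weights.

    effects   -- one effect size per study (e.g. a log2 fold change for a gene)
    variances -- the sampling variance of each effect size
    Returns (pooled_effect, standard_error).
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need at least one study and one variance per effect")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical values for one gene measured in three independent studies.
effects = [1.0, 2.0, 3.0]
variances = [1.0, 1.0, 2.0]
pooled, se = fixed_effect_meta(effects, variances)
print(f"pooled effect = {pooled:.3f}, standard error = {se:.3f}")
```

The third study has the largest variance, so it contributes least to the pooled estimate. A random-effects model would be a reasonable alternative when the studies are expected to differ systematically.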