St. Jude Children's Research Hospital - Washington University Pediatric Cancer Genome Project investigators have developed a dramatically better computer tool for finding the genetic missteps that fuel cancer.
Researchers are using the new algorithm to help identify the chromosomal rearrangements and DNA insertions or deletions unique to cancer.
The new computational method is known as CREST, short for Clipping Reveals Structure. Using CREST, researchers identified 89 new structural differences in the cancer genomes of five St. Jude patients with a subtype of acute lymphoblastic leukemia (ALL) known as T-lineage ALL. CREST revealed complex chromosomal rearrangements, including one that involved four chromosomes. Investigators also used the tool to find 50 new variations in melanoma cells. Melanoma is the most serious form of skin cancer. The study appears in the June 12 advance online edition of the scientific journal Nature Methods.
"CREST is significantly more accurate and sensitive than existing methods of finding structural variations in next-generation sequencing data. It finds differences between a patient's normal and cancer genomes other tools cannot find," said Jinghui Zhang, Ph.D., an associate member of the St. Jude Department of Computational Biology. She is the study's senior author. "Similar tools miss up to 60 to 70 percent of these structural rearrangements in tumors. CREST ensures that scientists will be able to find important structural variations that play critical roles in tumor formation."
Zhang said the need for new ways to identify the genomic variations that lead to cancer became clear shortly after the genome project began. The St. Jude - Washington University collaboration was launched in 2010 to sequence and compare the complete normal and cancer genomes of 600 young patients battling some of the most challenging forms of the disease. Organizers expect the three-year effort to revolutionize understanding of childhood malignancies and lay the foundation for new treatments to cure or prevent cancer, which remains the leading cause of death by disease in American children and adolescents.
The human genome is the complete set of instructions for guiding an individual's development and continuing function. Those instructions are encoded in the approximately 3.1 billion bases of DNA, which are arranged into the genes and the chromosomes found in almost every cell. The genome project takes advantage of next-generation sequencing technology, which reduced the cost and time needed to determine the order of the four chemical bases that make each person's DNA unique. If that order is disrupted, cancer can result.
Next-generation sequencing technology breaks the long, double-stranded DNA molecule into millions of smaller fragments, which are each copied about 30 times. Using a reference human genome as a template, those segments are then reassembled according to rules that govern the interaction of DNA's four chemical bases: adenine, thymine, cytosine and guanine. Those rules dictate that adenine pairs only with thymine and cytosine only with guanine. For this project, investigators are interested in where a patient's normal and cancer genomes differ. Researchers believe those differences include cancer's origins.
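The pairing rules described above can be illustrated with a short Python sketch. It is only an illustration of base pairing, not part of CREST or of any sequencing pipeline, and the names `PAIR` and `reverse_complement` are hypothetical.

```python
# Illustrative only: the base-pairing rules, A with T and C with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the sequence of the opposite DNA strand, read in its own direction."""
    # Walk the input backwards and swap each base for its pairing partner.
    return "".join(PAIR[base] for base in reversed(seq.upper()))


print(reverse_complement("ATGC"))
```

Because the two strands of DNA run in opposite directions, the partner strand is read in reverse, which is why the sketch reverses the input before swapping bases.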
Zhang and her colleagues began work on CREST when they manually detected a chromosomal rearrangement involving a known cancer gene that existing analytic tools failed to detect.
In developing CREST, researchers turned to pieces of DNA known as soft clips. These are the DNA segments produced during sequencing that fail to properly align to the reference human genome as the patient's genome is reassembled.
"Portions of the soft clip align nicely, but other portions just do not go together with the reference human genome," Zhang said, noting that although soft clips can be caused by chromosomal rearrangement, they have many causes and sometimes signal problems in sequencing data. Other analytic methods discard soft clips. But in developing CREST, researchers used the soft clips to precisely identify sites of chromosomal rearrangement or where pieces of DNA are inserted or deleted.
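The general idea of using soft clips to locate breakpoints can be sketched in Python. This is a simplified illustration under stated assumptions, not the CREST algorithm itself: it assumes alignments are given as a 1-based start position plus a CIGAR string in the SAM convention, where `S` marks soft-clipped bases, and it simply counts how many reads are clipped at the same position. The function names and the `min_support` threshold are hypothetical.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_breakpoints(pos, cigar):
    """Return (reference position, side) for each soft-clipped end of a read.

    pos is the 1-based leftmost aligned reference position, as in SAM.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips can sit outside soft clips; they carry no bases, so drop them.
    ops = [(n, op) for n, op in ops if op != "H"]
    breakpoints = []
    if ops and ops[0][1] == "S":
        # Clipping on the left: the aligned part starts at the breakpoint.
        breakpoints.append((pos, "left"))
    # Only these operations consume reference bases.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    if len(ops) > 1 and ops[-1][1] == "S":
        # Clipping on the right: the aligned part ends at the breakpoint.
        breakpoints.append((pos + ref_len - 1, "right"))
    return breakpoints


def candidate_sites(alignments, min_support=2):
    """Keep clip positions supported by at least min_support reads."""
    counts = Counter()
    for pos, cigar in alignments:
        counts.update(soft_clip_breakpoints(pos, cigar))
    return sorted(site for site, n in counts.items() if n >= min_support)


# Hypothetical reads: two are clipped on the right at reference position 159.
reads = [(100, "60M40S"), (120, "40M60S"), (159, "5S95M"), (300, "100M")]
print(candidate_sites(reads))
```

Requiring several reads to be clipped at the same position is one simple way to separate a real breakpoint from the sequencing problems that can also produce soft clips. A real tool would need further steps, such as checking where the clipped bases align elsewhere in the genome.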
"CREST marks the first use of soft clips to identify fusion proteins," Zhang said, referring to hybrid proteins made when genomic rearrangements fuse pieces of two genes. The resulting proteins can disrupt normal cellular controls and lead to the unchecked cell division that marks cancer.
Using CREST, researchers found 110 structural variations in the five T-ALL genomes, including 89 that scientists validated using other laboratory methods. That validation rate, about 81 percent, was higher than the rate achieved with other analytic tools.
When researchers used CREST to search for structural variations in the published whole-genome sequence of melanoma cells, they found 50 previously unidentified variations. Researchers went on to validate 18 of the 20 newly found variations selected for confirmatory testing.
"With the incorporation of CREST, we now can augment the existing approaches we have developed at Washington University to better detect and analyze important structural variants in human cancers," said co-author Li Ding, Ph.D., a geneticist and assistant director of medical genomics at Washington University's Genome Institute.