Oncology Live®
Even as technological advances make it possible to sequence DNA on a large scale at relatively lower costs and in shorter time-frames, emerging evidence from the world’s top research laboratories suggests that scientists are still a long way from having a complete catalog of cancer genes.
And, as a wealth of data accumulates, it is becoming increasingly clear that the sheer variety and frequency of mutations play a complex role in malignancies, underscoring the challenges of molecularly targeted anticancer strategies.
Recent research from the Broad Institute of MIT and Harvard highlights both of these developments. Researchers announced the discovery of 33 genes whose roles in promoting cancers had not been previously identified and described how the burden of mutations differs among tumor types.
The implications for therapies include the need for novel analytic techniques and fresh approaches to multigene targeting.
The renowned German biologist Theodor Boveri suspected the genetic origins of cancer more than a century ago, but it took almost 70 years for his suspicions to be confirmed. Now it is well understood that the majority of cancers arise as a result of the acquisition of a series of alterations to their genome, which lead to the dysregulation of key cellular processes.
During the past several decades, significant headway has been made in uncovering the genetic alterations that play a role in the development and progression of cancer and in identifying scores of oncogenes, which promote oncogenesis, and tumor suppressors, which keep it in check. Researchers have identified both driver mutations that are causally implicated in oncogenic development and passenger mutations that, as the name suggests, are just along for the ride and are likely a result of the genetic disarray of cancer, rather than the cause of it (Figure).
References
1. An O et al [published online March 7, 2014]. Database. doi:10.1093/database/bau015.
2. Vogelstein B et al. Science. 2013;339(6127):1546-1558.
All of the nearly 140 driver genes identified to date are linked to one or more cellular pathways or processes. This realization led to the idea of cancer “hallmarks,” that is, essential changes that normal cells undergo on the path to becoming malignant tumor cells. While the contributions of many of these hallmark processes to cancer seemed intuitive, genomic sequencing has revealed new, unexpected contributions from cellular processes that were not previously suspected to be involved in cancer, including epigenetic regulation, chromatin modification, and metabolism.
Candidate cancer genes are discovered by sequencing tumor samples alongside matched normal samples from the same patients, then flagging genes that carry more somatic mutations than would be expected by chance.
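The idea of "more mutations than expected" can be illustrated with a simple burden calculation. The sketch below is not the method used in any of the studies discussed here. It assumes, purely for illustration, that every base mutates independently at one uniform background rate, and it uses a Poisson approximation to ask how likely it would be to see at least the observed number of somatic mutations in a gene by chance. The function names, the gene length, the cohort size, and the background rate are all hypothetical.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for a Poisson variable with mean `expected` (> 0)."""
    if observed <= 0:
        return 1.0
    # Probability of exactly `observed` events, computed in log space.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    # Sum the upper tail directly until further terms are negligible.
    while term > 0.0 and term > total * 1e-17:
        total += term
        k += 1
        term *= expected / k
    return min(total, 1.0)

def burden_p_value(observed, gene_length_bp, n_samples, background_rate_per_bp):
    """Chance of seeing at least `observed` mutations under a uniform background rate.

    Assumption (illustrative only): one background rate applies to every
    base in every sample, which real tumors do not satisfy.
    """
    expected = gene_length_bp * n_samples * background_rate_per_bp
    return poisson_upper_tail(observed, expected)

# Hypothetical gene: 1,500 coding bases, 200 tumor-normal pairs,
# background rate of 1 mutation per megabase, 6 somatic mutations observed.
p = burden_p_value(observed=6, gene_length_bp=1500, n_samples=200,
                   background_rate_per_bp=1e-6)
print(f"p-value: {p:.2e}")
```

A small p-value marks the gene as a candidate only under the uniform-rate assumption; where the true background rate varies between tumors or genes, a test this simple would mislabel genes.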
The process was initially time-consuming and expensive, but advancements in sequencing technology in the past decade, particularly the introduction of massively parallel sequencing, have brought about a 1 million-fold decrease in cost and a sharp increase in the capacity of DNA sequencing. It is now possible to sequence a genome for $1000 and to sequence more than 600 gigabases (Gb) per run.
This graphic depicts the median frequencies of somatic mutations in the exome (horizontal lines) across multiple tumor types from the lowest frequencies, left, to the highest frequencies, right, as measured in mutations per megabase (Mb). The dots represent tumor-normal pairs. The colored panel at bottom illustrates the distribution of 6 possible base-pair substitutions in key at left.
Broad Institute of MIT and Harvard. Reprinted with permission.
Major international efforts are currently under way to catalog the cancer genome, including those by The Cancer Genome Atlas and the International Cancer Genome Consortium, and there is a growing interest in pan-cancer analyses. Promising results from many of these studies were reported in 2013. Currently, roughly 10% of the known genes in the human genome have been identified as or are suspected to be cancer genes. But an important question remains: How far are we from a complete picture?
A recent study published in Nature found that, while we are still a long way from our goal, it should ultimately be achievable with current technology if we can overcome some hurdles first. Michael S. Lawrence, PhD, a computational biologist at the Broad Institute in Cambridge, Massachusetts, and colleagues performed a large-scale genomic analysis using exome sequences from almost 5000 human cancers and their matched normal tissue pairs across 21 tumor types.
The researchers analyzed somatic mutation rates within these samples and used a technique called down-sampling to examine how the number of cancer genes identified changes with sample size. This enabled them to determine whether we are close to having identified all genes involved in cancer.
What they found was that the number of genes increased in a roughly linear fashion as sample size and the number of tumor types analyzed increased. Since the discovery of novel cancer genes is still on an upward curve, this suggests that many cancer genes remain to be discovered.
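To make the down-sampling idea concrete, the following sketch repeats a toy gene-detection step on random subsets of a simulated cohort and records how many genes are found at each subset size. It is a simplified illustration under stated assumptions, not a reconstruction of the published analysis: the cohort is simulated, the gene names and mutation frequencies are invented, and "detected" simply means mutated in at least `min_hits` samples, which stands in for a proper significance test.

```python
import random

def simulate_cohort(n_samples, gene_frequencies, rng):
    """Return one set of mutated gene names per simulated tumor sample."""
    cohort = []
    for _ in range(n_samples):
        mutated = {gene for gene, freq in gene_frequencies.items()
                   if rng.random() < freq}
        cohort.append(mutated)
    return cohort

def genes_detected(samples, min_hits):
    """Count genes mutated in at least `min_hits` samples (a crude stand-in test)."""
    counts = {}
    for mutated in samples:
        for gene in mutated:
            counts[gene] = counts.get(gene, 0) + 1
    return sum(1 for c in counts.values() if c >= min_hits)

def down_sampling_curve(cohort, sizes, min_hits, n_repeats, rng):
    """Average number of detected genes over random subsets of each size."""
    curve = {}
    for size in sizes:
        totals = [genes_detected(rng.sample(cohort, size), min_hits)
                  for _ in range(n_repeats)]
        curve[size] = sum(totals) / n_repeats
    return curve

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical genes: a few mutated often, many mutated rarely.
gene_frequencies = {f"high_{i}": 0.25 for i in range(5)}
gene_frequencies.update({f"mid_{i}": 0.05 for i in range(20)})
gene_frequencies.update({f"low_{i}": 0.01 for i in range(100)})

cohort = simulate_cohort(2000, gene_frequencies, rng)
curve = down_sampling_curve(cohort, [50, 200, 800, 2000],
                            min_hits=10, n_repeats=20, rng=rng)
for size, mean_count in curve.items():
    print(size, round(mean_count, 1))
```

If the count is still climbing at the largest subset size, the catalog is not yet saturated, which is the reasoning the researchers applied to their real data.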
The team then used the data they had collected to provide an estimate of what kind of sample size might get us close to a complete record of cancer genes and estimated that we would need to analyze 100,000 samples across roughly 50 tumor types. While this may seem a daunting task, in theory it should be easily achievable given the many millions of people living with cancer worldwide.
In fact, this study in itself is a kind of proof of concept; the research team was able to determine that large-scale genomic analyses like the one they performed could identify all 138 currently known driver genes. Furthermore, using novel analytical techniques to overcome some of the current issues with sequencing large numbers of genes, they identified 33 new cancer genes, 21 of which demonstrated strong and consistent functional links to cancer hallmark processes (Table). This study effectively expanded current knowledge of cancer genes by 25%.
Although sequencing such large numbers of samples may be feasible with current technology, it isn’t without hurdles. Most significantly, as sample size increases, the screen for new cancer genes becomes less accurate and the number of genes that are falsely linked to cancer development increases.
Another issue that the Broad Institute study and several other high-profile studies that have emerged in the past few years are beginning to uncover is the significant complexity in the mutational origins of cancer. Multiple genetic mutations contribute to cancer, global and individual gene mutation rates are highly variable, and the specific genes that cause tumors differ both across and within different tumor types. This mutational heterogeneity can also contribute significantly to the generation of false positives in sequencing studies.
On an individual level, a minority of cancer genes are mutated at high frequencies (>20%) across different cancer types; it is these genes that have proved more amenable to detection. As such, it is thought that the identification of novel, highly mutated genes is likely to be slowing down. However, the vast majority of cancer genes are mutated at much lower frequencies (2%-20% or less) and it is in this category that we have significant gaps in our current knowledge.
Lawrence et al confirmed that the discovery of genes mutated at a rate of >20% is reaching saturation, while the discovery rate among genes mutated at 10% to 20% is still increasing, but at a decreased rate. At a mutational frequency of 5% to 10%, the number of newly identified genes is increasing linearly, and at <5% it is increasing at an accelerating rate. The majority of genes fall in these lower frequency categories, so identifying them is especially important: they could provide therapeutic options that are effective in the majority of patients, rather than in specific subsets of patients as with many current targeted therapies.
The overall mutation rate of a tumor, known as its mutational load, also varies substantially, ranging from 0.1 mutation/megabase (Mb) in some pediatric tumors to around 100 mutations/Mb in lung cancer and melanoma. One recent study found that the median frequency of mutations varied by more than 1000-fold across different cancer types. Tumors with higher mutational loads pose a significant challenge to genome sequencing studies.
The majority of cancers have between 2 and 8 driver mutations; the remaining >99.9% of mutations are irrelevant passenger mutations. A higher mutational load makes it more difficult to accurately distinguish the drivers from the passengers.
Researchers have begun to postulate reasons for the variation in mutation rates. In part, it may be explained by variations in the timing of the occurrence of driver mutations. In some tumor types, particularly those that arise from self-renewing tissues in which frequent cell division offers ample opportunity for mutations to occur, a significant number of mutations may have accumulated prior to the cancer-initiating driver mutation event.
It may also be partly explained by biological factors, such as whether the cancer is associated with a mutagen, which would induce many more mutations. Mutagen exposure would explain why lung cancer, which tends to be associated with cigarette smoking, and melanoma, which is associated with ultraviolet exposure, are at the higher end of the mutational load scale.
Research teams like the one at the Broad Institute are trying to develop methods to help overcome the issue of mutational variation in genome sequencing studies, to give them the ability to accurately identify driver mutations among large numbers of genes.
Lawrence et al developed a method called MutSig, which they used in their study. This method corrects for mutational variation by performing three separate tests: a mutational burden test, a mutational clustering test, and a mutational functional impact test. In addition to looking for high mutational burden relative to expectation, the system looks for clustering of mutations within a candidate gene, and whether mutations within that gene are enriched at functional sites. The significance levels from these three tests are combined to obtain a single significance level for each gene.
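The final step described above, merging three significance levels into one, can be sketched with Fisher's method, a standard way to combine p-values. This is an assumption made for illustration: the article does not say which combination rule MutSig uses, and Fisher's method is only valid if the tests are independent. The function name and the three example p-values are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, -2 * sum(ln p) follows a chi-square
    distribution with 2k degrees of freedom, where k is the number of
    tests. For even degrees of freedom the upper tail has a closed form,
    so no statistics library is needed.
    """
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # the statistic divided by 2
    # Upper tail for 2k degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Hypothetical per-gene results from the three tests:
# burden, clustering, and functional impact.
combined = fisher_combined_p([0.01, 0.2, 0.03])
print(f"combined p-value: {combined:.4f}")
```

A gene that looks only moderately unusual on each test can still reach a strong combined significance, which is the appeal of pooling several kinds of evidence.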
Some of the potential clinical implications of variable driver mutation rates were highlighted by George W. Sledge Jr, MD, professor of oncology at the Stanford University Medical Center, at the 30th Annual Miami Breast Cancer Conference in Miami Beach, Florida, last year. Cancers with few driver mutations, which he dubbed “stupid” cancers, generally tend to be easier to treat as they respond to single-agent therapy. In contrast, Sledge said, “smart” cancers such as triple-negative breast cancer or lung cancer have many driver mutations and tend to be highly resistant to treatment.
Sledge and others believe that an overhaul of the clinical development of molecularly targeted agents is needed to fully capitalize on the discoveries of novel cancer genes and to target smarter cancers more effectively.
According to Sledge, the current clinical development system emphasizes single agents, runs combination trials that are not biomarker-based, and does not treat biomarker development as a priority. He proposed that trial design should focus on multitargeting, with increased collaboration to foster combination therapy, that bioinformatics should be provided in real time, and that an information technology network should be used to support clinical trials and cancer care. Furthermore, he believes a fundamentally different regulatory apparatus may be required. Only then might we achieve truly personalized therapy in the genomic era.
Jane de Lartigue, PhD, is a freelance medical writer and editor based in Davis, California.
Key Research
An O, Pendino V, D’Antonio M, et al. NCG 4.0: the network of cancer genes in the era of massive mutational screenings of cancer genomes [published online March 7, 2014]. Database. 2014: article ID bau015. doi:10.1093/database/bau015.
Garraway LA. Genomics-driven oncology: framework for an emerging paradigm [published online April 15, 2013]. J Clin Oncol. 2013;31(15):1806-1814.
Garraway LA, Lander ES. Lessons from the cancer genome. Cell. 2013;153(1):17-37.
Kandoth C, McLellan MD, Vandin F, et al. Mutational landscape and significance across 12 major cancer types. Nature. 2013;502(7471):333-339.
Lawrence MS, Stojanov P, Mermel CH, et al. Discovery and saturation analysis of cancer genes across 21 tumor types [published online January 5, 2014]. Nature. 2014;505(7484):495-501.
Lawrence MS, Stojanov P, Polak P, et al. Mutational heterogeneity in cancer and the search for new cancer genes [published online June 16, 2013]. Nature. 2013;499(7457):214-218.
Stephens PJ, Tarpey PS, Davies H, et al. The landscape of cancer genes and mutational processes in breast cancer. Nature. 2012;486(7403):400-404.
Vogelstein B, Papadopoulos N, Velculescu VE, et al. Cancer genome landscapes. Science. 2013;339(6127):1546-1558.