Discovery of cancer driver genes for precision medicine in childhood cancers

Mona Nourbakhsh

Research output: Book/Report, Ph.D. thesis

Abstract

Cancer is a complex and heterogeneous disease, affecting millions of people worldwide. This disease comprises around 200 types, each with multiple subtypes. Cancer arises due to dynamic changes in the genome caused by various (epi)genetic alterations such as mutations or abnormal DNA methylation. These alterations occur in genes called driver genes as they drive cancer development when altered. With the advent of next-generation sequencing technologies, a wealth of cancer sequencing data is now available, necessitating analysis and interpretation to gain further insights into cancer biology. While data quantity formerly presented a constraint, today’s bottleneck resides in the speed of analyzing and interpreting the substantial volume of available data. Here, bioinformatics plays a pivotal role as a key link between cancer data analysis and clinical decision-making.
Analyzing sequencing data has unveiled molecular markers in cancer. For example, the discovery of the frequently mutated BRAF oncogene catalyzed the development of BRAF inhibitors, now routinely used in clinical settings. Besides the development of targeted therapies, predicting cancer aberrations is crucial for patient stratification, which allows for cancer subtyping, improved risk classification, prognosis prediction, and the design of effective treatment strategies with a precision medicine approach. These cases illustrate the potential of cancer marker discovery for specific subgroups, generating useful knowledge that ultimately can improve patient outcomes.
Thus, this PhD thesis aims to advance our molecular and biological knowledge of cancer development, with a focus on cancer subtypes, through prediction of novel driver genes and markers. To this end, the thesis analyzes cancer -omics data using bioinformatic approaches and includes four manuscripts. Manuscript I reviews and provides an overview of 74 state-of-the-art bioinformatic tools, including driver gene prediction tools. Categorizing these driver gene prediction tools by their underlying computational method resulted in four main groups, each with unique strengths and drawbacks: interaction network construction, multi-omics data integration, machine learning, and mutational information. We found that few tools distinguish between classes of driver genes, i.e. tumor suppressor genes and oncogenes; most instead predict driver genes as a single group. Because research fields sometimes operate in isolation, valuable synergies or breakthroughs are potentially missed, and we therefore highlighted integrative frameworks as a recommendation for future directions. Manuscripts II-III center around one such integrative framework for driver gene discovery called Moonlight. Manuscript II introduces a novel functionality into Moonlight which uses driver mutations as mutational evidence for the deregulated expression profiles of driver genes. By applying this new functionality to three cancer (sub)types (basal-like breast cancer, lung adenocarcinoma, and thyroid carcinoma), we discovered 278, 131, and 12 driver genes, respectively. Furthermore, we found 13 and two driver genes in basal-like breast cancer and lung adenocarcinoma, respectively, with a driver mutation in their promoter region, potentially explaining their deregulation. Similarly, Manuscript III implements another new feature in Moonlight which uses abnormal DNA methylation patterns as evidence of deregulation of driver genes.
This study resulted in 33, 190, and 263 methylation-driven genes in basal-like breast cancer, lung adenocarcinoma, and thyroid carcinoma, respectively. Furthermore, we found 20 prognostic oncogenes in lung adenocarcinoma and two prognostic oncogenes in thyroid carcinoma, highlighting a greater prognostic effect of oncogenes than tumor suppressor genes. We also found 7, 24, and 23 driver genes in basal-like breast cancer, lung adenocarcinoma, and thyroid carcinoma, respectively, previously annotated as cancer drug targets, highlighting their therapeutic potential. Before the two advancements of Moonlight presented in Manuscripts II-III, incorporating (epi)genetic evidence was left to the user, with no defined protocol. Finally, Manuscript IV uses a data-driven approach including various biostatistical and machine learning methods to discover gene expression markers differentiating two acute lymphoblastic leukemia (ALL) subtypes, B-cell precursor ALL and T-cell ALL. We found 14 markers separating these two subtypes. The expression level of all 14 markers also had significant effects on patient survival, indicating a worse prognosis for patients with B-cell precursor ALL. Expression of six of these 14 markers could perfectly differentiate between B-cell precursor ALL and T-cell ALL in an independent cohort of pediatric patients with ALL. Lastly, this study stratifies the patients into additional subgroups beyond B-cell precursor ALL and T-cell ALL based on expression profiles, resulting in four clusters with eight genes distinguishing two of these clusters.
In summary, this PhD thesis delves into the complexity of cancer, seeking to deepen our understanding of carcinogenesis through a systems biology approach to -omics data analysis. It demonstrates that -omics analyses offer novel perspectives on the intricate landscape of cancer. While the results require additional validation through more data and experiments, the manuscripts provide novel insights into the evolving field of cancer systems biology.
Original language: English
Publisher: DTU Health Technology
Number of pages: 384
Publication status: Published - 2024
