e-Article

mitoSomatic: a tool for accurate identification of mitochondrial DNA somatic mutations without paired controls
Document Type
Academic Journal
Source
Molecular Oncology. May, 2023, Vol. 17 Issue 5, p857, 871 p.
Subject
Machine learning
Mitochondrial DNA
Language
English
ISSN
1574-7891
Abstract
Mitochondrial DNA (mtDNA) somatic mutations play important roles in the initiation and progression of cancer. Although next‐generation sequencing (NGS) of paired tumor and control samples has become a common practice to identify tumor‐specific mtDNA mutations, the unique nature of mtDNA and NGS‐associated sequencing bias could cause false‐positive/‐negative somatic mutation calling. Additionally, there are clinical scenarios where matched control tissues are unavailable for comparison. Therefore, a novel approach for accurately identifying somatic mtDNA variants is greatly needed, particularly in the absence of matched controls. In this study, the ground truth mtDNA variants orthogonally validated by triple‐paired tumor, adjacent nontumor, and blood samples were used to develop mitoSomatic, a random forest‐based machine learning tool. We demonstrated that mitoSomatic achieved area under the curve (AUC) values over 0.99 for identifying somatic mtDNA variants without paired control in three tumor types. In addition, mitoSomatic was also applicable in nontumor tissues such as adjacent nontumor and blood samples, suggesting the flexibility of mitoSomatic's classification capability. Furthermore, analysis of triple‐paired samples identified a small group of variants with uncertain somatic/germline origin, whereas application of mitoSomatic significantly facilitated the prediction of their possible source. Finally, a control‐free evaluation of the public pan‐cancer NGS dataset with mitoSomatic revealed a substantial number of variants that were probably misclassified by conventional tumor‐control comparison, further emphasizing the usefulness of mitoSomatic in application. Taken together, our study demonstrates that mitoSomatic is valuable for accurately identifying somatic mtDNA variants in mtDNA NGS data without paired controls, applicable for both tumor and nontumor tissues.
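The AUC figure quoted in the abstract can be read as the probability that a randomly chosen somatic variant receives a higher classifier score than a randomly chosen germline variant. The sketch below illustrates that interpretation only. It is not the mitoSomatic implementation: the function name `roc_auc`, the label coding (1 = somatic, 0 = germline), and the scores are hypothetical values chosen for illustration.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC computed by pairwise comparison.

    AUC equals the fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = somatic, 0 = germline (assumed coding).
# The scores stand in for a classifier's predicted probability of "somatic".
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.95, 0.80, 0.70, 0.10, 0.30, 0.75, 0.05, 0.90]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

In this toy data one germline variant (score 0.75) outranks one somatic variant (score 0.70), so 15 of the 16 somatic/germline pairs are ordered correctly. An AUC above 0.99, as reported in the abstract, would correspond to fewer than 1 in 100 such pairs being misordered.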
Introduction
Mitochondria contain their own genome, the mitochondrial DNA (mtDNA), which in humans is a genetically compact, double‐stranded, circular molecule of 16.5 kb, encoding 2 rRNAs, 22 tRNAs, and [...]