Academic article

CloneRetriever: An Automated Algorithm to Identify Clonal B and T Cell Gene Rearrangements by Next-Generation Sequencing for the Diagnosis of Lymphoid Malignancies
Document Type
article
Source
Clinical Chemistry. 67(11)
Subject
Medical Biochemistry and Metabolomics
Biomedical and Clinical Sciences
Clinical Sciences
Rare Diseases
Networking and Information Technology R&D (NITRD)
Human Genome
Hematology
Bioengineering
Genetics
Cancer
Detection, screening and diagnosis
4.1 Discovery and preclinical testing of markers and technologies
Algorithms
Gene Rearrangement
Gene Rearrangement, T-Lymphocyte
High-Throughput Nucleotide Sequencing
Humans
Neoplasm, Residual
bioinformatics
immunoglobulin sequencing
lymphoma diagnostics
Medical Biotechnology
General Clinical Medicine
Clinical sciences
Medical biochemistry and metabolomics
Language
Abstract
Background: Clonal immunoglobulin and T-cell receptor rearrangements serve as tumor-specific markers that have become mainstays of the diagnosis and monitoring of lymphoid malignancy. Next-generation sequencing (NGS) techniques targeting these loci have been successfully applied to lymphoblastic leukemia and multiple myeloma for minimal residual disease detection. However, adoption of NGS for primary diagnosis remains limited.

Methods: We addressed the bioinformatics challenges associated with immune cell sequencing and clone detection by designing a novel web tool, CloneRetriever (CR), which uses machine-learning principles to generate clone classification schemes that are customizable and can be applied to large datasets. CR has 2 applications: a "validation" mode to derive a clonality classifier, and a "live" mode to screen for clones by applying a validated and/or customized classifier. In this study, CR generated multiple classifiers using 2 datasets comprising 106 annotated patient samples. A custom classifier was then applied to 36 unannotated samples.

Results: The optimal classifier for clonality required clonal dominance ≥4.5× above background, read representation ≥8% of all reads, and technical replicate agreement. Depending on the dataset and analysis step, the optimal algorithm yielded sensitivities of 81%-90%, specificities of 97%-100%, areas under the curve of 91%-94%, positive predictive values of 92%-100%, and negative predictive values of 88%-98%. Customization of the algorithms yielded 95%-100% concordance with gold-standard clonality determination, including rescue of indeterminate samples. Application to a set of unknowns showed concordance rates of 83%-96%.

Conclusions: CR is user-friendly software, ready to use out of the box, designed to identify clonal rearrangements in large NGS datasets for the diagnosis of lymphoid malignancies.
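As an illustration only, the three criteria reported for the optimal classifier (dominance ≥4.5× above background, ≥8% of all reads, and technical replicate agreement) could be combined as in the minimal Python sketch below. This is not CloneRetriever's code: the `ReplicateCall` structure, the function names, and the reading of "replicate agreement" as "every replicate reports the same top sequence and passes both thresholds" are assumptions made for this example, and the abstract does not say how background is computed.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract; everything else is a hypothetical illustration.
MIN_FOLD_OVER_BACKGROUND = 4.5
MIN_READ_FRACTION = 0.08


@dataclass(frozen=True)
class ReplicateCall:
    """Top rearrangement observed in one technical replicate (hypothetical structure)."""
    sequence: str                # identifier of the most abundant rearrangement
    fold_over_background: float  # dominance relative to background (definition assumed)
    read_fraction: float         # share of all reads, between 0.0 and 1.0


def passes_thresholds(rep: ReplicateCall) -> bool:
    """Check the dominance and read-representation cutoffs for one replicate."""
    return (rep.fold_over_background >= MIN_FOLD_OVER_BACKGROUND
            and rep.read_fraction >= MIN_READ_FRACTION)


def is_clonal(replicates: list[ReplicateCall]) -> bool:
    """Call a sample clonal only if the replicates agree (assumed interpretation)."""
    if len(replicates) < 2:
        return False  # agreement cannot be assessed with fewer than 2 replicates
    same_sequence = len({rep.sequence for rep in replicates}) == 1
    return same_sequence and all(passes_thresholds(rep) for rep in replicates)
```

For example, two replicates that both report the same sequence at 6× background and 12% of reads would be called clonal under these assumptions, while a pair in which one replicate falls to 5% of reads would not.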