Academic Article

Novel analytical methods to interpret large sequencing data from small sample sizes
Document Type
article
Source
Human Genomics, Vol 13, Iss 1, Pp 1-11 (2019)
Subject
Chronic myeloid leukemia
Next-generation sequencing
Pharmacogenetics
Small sample size
Statistics
Factorial correspondence analysis
Medicine
Genetics
QH426-470
Language
English
ISSN
1479-7364
Abstract
Background: Targeted therapies have greatly improved the prognosis of cancer patients. For instance, chronic myeloid leukemia is now treated effectively with imatinib, a tyrosine kinase inhibitor, and around 80% of patients reach complete remission. However, despite the drug's high efficacy, some patients are resistant to it. This heterogeneity in response might be associated with pharmacokinetic parameters, which vary between individuals because of genetic variants. To investigate this, next-generation sequencing of large gene panels can be performed on patient samples. However, a common problem in pharmacogenetic studies is the limited availability of samples. As a result, large sequencing data sets are obtained from small sample sizes, and classical statistical analyses cannot be applied to identify interesting targets. To overcome this limitation, we describe original and underused statistical methods for analyzing large sequencing data from a restricted number of samples.

Results: To evaluate the relevance of our method, 48 genes involved in pharmacokinetics were sequenced by next-generation sequencing in 24 chronic myeloid leukemia patients who were either sensitive or resistant to imatinib treatment. Using a graphical representation, the 708 identified polymorphisms were reduced to a list of 115 candidates. Then, by analyzing each gene and the distribution of variant alleles, several candidates were highlighted, such as UGT1A9, PTPN22, and ERCC5. Previous studies have already associated these genes with the transport of, the metabolism of, and even the sensitivity to imatinib.

Conclusions: These tests are useful alternatives to inferential statistics, which are not applicable to next-generation sequencing experiments performed on small sample sizes. These approaches make it possible to reduce the number of targets and to identify good candidates for further studies of treatment sensitivity.