Academic Paper

A comprehensive imputation-based evaluation of tag SNP selection strategies
Document Type
Conference
Source
2021 13th International Conference on Knowledge and Systems Engineering (KSE), pp. 1-6, Nov. 2021
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Signal Processing and Analysis
Measurement
Knowledge engineering
Couplings
Sequential analysis
Systematics
Pipelines
Genomics
Tag SNP selection
SNP array design
genotyping imputation
linkage disequilibrium
ISSN
2694-4804
Abstract
Despite the rapid development of sequencing technology, single nucleotide polymorphism (SNP) arrays remain widely used in large-scale genomic studies because of their cost-effectiveness. Recently, in parallel with advances in imputation strategies, several genotyping platforms have been developed for various species. Although imputation accuracy is important in SNP array design, to the best of our knowledge no systematic study has evaluated tag SNP selection methods using this metric. In this paper, using a leave-one-out cross-validation approach on the 1000 Genomes high-coverage dataset, we comprehensively evaluate four well-known tag SNP selection algorithms in terms of imputation accuracy. Our results show that, although all widely used methods for SNP array design provide reasonable imputation accuracy, the pairwise linkage disequilibrium-based tag SNP selection algorithm achieves the best performance. Our pipelines for running the evaluated algorithms and the leave-one-out cross-validation are publicly available at https://github.com/datngu/TagSNP_evaluation.
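As an illustration of the general idea behind pairwise linkage disequilibrium based tagging, the following minimal Python sketch greedily picks tag SNPs so that every SNP is correlated with at least one tag at or above an r² threshold. It is not taken from the authors' pipeline: the function names, the toy genotype data, the SNP identifiers, and the 0.8 threshold are all assumptions made for this example, and r² is approximated here by the squared Pearson correlation of genotype dosages.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as uncorrelated
    return cov * cov / (var_x * var_y)


def greedy_pairwise_tags(genotypes, threshold=0.8):
    """Greedy pairwise tagging (hypothetical helper, not the paper's code).

    genotypes: dict mapping SNP id -> list of dosages, one per sample.
    Returns the list of selected tag SNP ids, in selection order.
    """
    snps = list(genotypes)
    # captured[s] holds every SNP that s tags at r^2 >= threshold, itself included.
    captured = {s: {s} for s in snps}
    for a, b in combinations(snps, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            captured[a].add(b)
            captured[b].add(a)

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(captured[s] & untagged))
        tags.append(best)
        untagged -= captured[best]
    return tags


# Toy data (assumed for illustration): six samples, four SNPs.
genotypes = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2],  # identical to snp1
    "snp3": [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated with snp1
    "snp4": [0, 0, 1, 2, 2, 1],  # uncorrelated with snp1
}
tags = greedy_pairwise_tags(genotypes)
print(tags)
```

In this toy data a single tag covers the three mutually correlated SNPs, and the uncorrelated SNP must be selected as its own tag. A real array design would additionally restrict comparisons to nearby SNPs and then assess the chosen tags by imputing the untyped SNPs, as the abstract describes.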