Academic article

Multiple Variant Calling Pipelines in Wheat Whole Exome Sequencing
Document Type
article
Source
International Journal of Molecular Sciences. 22(19)
Subject
Biological Sciences
Bioinformatics and Computational Biology
Genetics
Human Genome
Genome
Plant
Polymorphism
Single Nucleotide
Polyploidy
Triticum
Exome Sequencing
wheat
SNPs
WES
variants
BCFtools
STAR
Bowtie2
BWA
Other Chemical Sciences
Other Biological Sciences
Chemical Physics
Biochemistry and cell biology
Microbiology
Medicinal and biomolecular chemistry
Language
Abstract
The highly challenging hexaploid wheat (Triticum aestivum) genome is becoming ever more accessible thanks to the continued development of multiple reference genomes, which aids efforts to better understand variation in important traits. Although the process of variant calling is relatively straightforward, selecting the best combination of computational tools for the read alignment and variant calling stages of the analysis, and efficiently filtering out false variant calls, are not always easy tasks. Previous studies have analyzed the impact of methods on quality metrics in diploid organisms. Given that variant identification in wheat largely relies on accurate mining of exome data, there is a critical need to better understand how different methods affect the analysis of whole exome sequencing (WES) data in polyploid species. This study addresses that need by performing whole exome sequencing of 48 wheat cultivars and assessing the performance of various variant calling pipelines at their suggested settings. The results show that all the pipelines require filtering to eliminate false-positive calls. The high consensus among the reference SNPs called by the best-performing pipelines suggests that filtering provides accurate and reproducible results. This study also provides detailed comparisons of sensitivity and precision at the individual and population levels for the raw and filtered SNP calls.
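
As a rough illustration of the two metrics named in the abstract, the sketch below scores a SNP call set against a reference set using the standard definitions: precision = TP / (TP + FP) and sensitivity = TP / (TP + FN). It is not the authors' evaluation code. The `Snp` type, the `precision_sensitivity` helper, the chromosome names, and all positions are hypothetical toy values chosen only to show how filtering out false-positive calls raises precision while leaving sensitivity unchanged.

```python
from typing import Iterable, NamedTuple, Tuple


class Snp(NamedTuple):
    """A SNP identified by site and alleles (hypothetical minimal record)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def precision_sensitivity(called: Iterable[Snp],
                          reference: Iterable[Snp]) -> Tuple[float, float]:
    """Return (precision, sensitivity) of a call set against a reference set."""
    called_set, reference_set = set(called), set(reference)
    tp = len(called_set & reference_set)   # calls confirmed by the reference
    fp = len(called_set - reference_set)   # calls absent from the reference
    fn = len(reference_set - called_set)   # reference SNPs that were missed
    precision = tp / (tp + fp) if called_set else 0.0
    sensitivity = tp / (tp + fn) if reference_set else 0.0
    return precision, sensitivity


# Toy data (made up for illustration, not taken from the study).
reference = [
    Snp("chr1A", 100, "A", "G"),
    Snp("chr1A", 250, "C", "T"),
    Snp("chr1B", 400, "G", "A"),
    Snp("chr1D", 900, "T", "C"),
]
true_calls = reference[:3]
false_calls = [
    Snp("chr1A", 120, "A", "T"),
    Snp("chr1B", 410, "G", "C"),
    Snp("chr1D", 950, "C", "G"),
]

raw_calls = true_calls + false_calls   # unfiltered pipeline output
filtered_calls = true_calls            # idealized filter: drops only false calls

raw_metrics = precision_sensitivity(raw_calls, reference)
filtered_metrics = precision_sensitivity(filtered_calls, reference)
print("raw:", raw_metrics)
print("filtered:", filtered_metrics)
```

In practice the call and reference sets would be read from VCF files, and a real filter would also discard some true calls, so sensitivity typically drops slightly as precision rises.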