Academic paper

Improved sequence mapping using a complete reference genome and lift-over
Document Type
Original Paper
Source
Nature Methods: Techniques for life scientists and chemists. 21(1):41-49
Language
English
ISSN
1548-7091
1548-7105
Abstract
Complete, telomere-to-telomere (T2T) genome assemblies promise improved analyses and the discovery of new variants, but many essential genomic resources remain associated with older reference genomes. Thus, there is a need to translate genomic features and read alignments between references. Here we describe a method called levioSAM2 that performs fast and accurate lift-over between assemblies using a whole-genome map. In addition to enabling the use of several references, we demonstrate that aligning reads to a high-quality reference (for example, T2T-CHM13) and lifting to an older reference (for example, GRCh38 from the Genome Reference Consortium (GRC)) improves the accuracy of the resulting variant calls on the old reference. By leveraging the quality improvements of T2T-CHM13, levioSAM2 reduces small and structural variant calling errors compared with GRC-based mapping using real short- and long-read datasets. Performance is especially improved for a set of complex medically relevant genes, where the GRC references are of lower quality.
By combining fast lift-over and selective re-mapping, levioSAM2 enables efficient and accurate read mapping and variant calling leveraging complete reference genomes.
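As a rough illustration of the idea of lift-over followed by selective re-mapping, the sketch below translates positions through a list of gap-free aligned blocks and sets aside any read whose position falls outside every block. It is not levioSAM2's implementation: the `Block` type, the function names `lift_position` and `lift_or_defer`, and the toy coordinates are all hypothetical, and the sketch assumes a single contig, forward-strand blocks only, and blocks sorted by source start.

```python
from bisect import bisect_right
from typing import Dict, List, NamedTuple, Optional, Sequence, Tuple


class Block(NamedTuple):
    """Hypothetical gap-free aligned block between two assemblies."""
    src_start: int  # 0-based start on the source assembly
    dst_start: int  # 0-based start on the target assembly
    length: int     # number of aligned bases in the block


def lift_position(blocks: Sequence[Block], pos: int) -> Optional[int]:
    """Return the target coordinate for pos, or None if no block covers it.

    Assumes blocks are sorted by src_start and do not overlap.
    """
    starts = [b.src_start for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    b = blocks[i]
    if pos < b.src_start + b.length:
        return b.dst_start + (pos - b.src_start)
    return None  # pos lies in a gap between blocks


def lift_or_defer(
    blocks: Sequence[Block], read_positions: Dict[str, int]
) -> Tuple[Dict[str, int], List[str]]:
    """Lift what can be lifted; collect the rest for re-mapping."""
    lifted: Dict[str, int] = {}
    deferred: List[str] = []
    for name, pos in read_positions.items():
        new_pos = lift_position(blocks, pos)
        if new_pos is None:
            deferred.append(name)  # candidate for re-alignment to the target
        else:
            lifted[name] = new_pos
    return lifted, deferred


# Toy map: source [0, 50) -> target [100, 150), source [60, 100) -> target [170, 210)
blocks = [Block(0, 100, 50), Block(60, 170, 40)]
lifted, deferred = lift_or_defer(blocks, {"r1": 10, "r2": 55, "r3": 75})
print(lifted)    # {'r1': 110, 'r3': 185}
print(deferred)  # ['r2']
```

In this toy case, read `r2` starts in a gap between blocks, so it is deferred rather than lifted. A real tool would also have to handle multiple contigs, reverse-strand blocks, and alignment details such as CIGAR strings, which this sketch omits.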