Academic article

Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies
Document Type
article
Source
Genome Biology, Vol 21, Iss 1, Pp 1-27 (2020)
Subject
Genome assembly
Assembly validation
Benchmarking
K-mers
Haplotype phasing
Trio binning
Biology (General)
QH301-705.5
Genetics
QH426-470
Language
English
ISSN
1474-760X
Abstract
Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness. For trios, Merqury can also evaluate haplotype-specific accuracy, completeness, phase block continuity, and switch errors. Multiple visualizations, such as k-mer spectrum plots, can be generated for evaluation. We demonstrate on both human and plant genomes that Merqury is a fast and robust method for assembly validation.
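
The k-mer comparison described in the abstract can be illustrated with a toy sketch. This is not Merqury's implementation: it works on distinct k-mers held in Python sets, ignores reverse complements, k-mer multiplicities, and any filtering of low-frequency read k-mers, and the function names (`kmers`, `kmer_set`, `evaluate`) are hypothetical. The error model is also an assumption that the abstract does not state: a base is taken to be correct with probability equal to the shared k-mer fraction raised to the power 1/k, and that probability is converted to a Phred-scaled quality value.

```python
import math


def kmers(seq, k):
    """Return the set of distinct k-length substrings of one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_set(seqs, k):
    """Union of the k-mer sets of several sequences."""
    out = set()
    for s in seqs:
        out |= kmers(s, k)
    return out


def evaluate(assembly, reads, k):
    """Compare assembly k-mers with read k-mers (toy model, see lead-in)."""
    asm = kmer_set(assembly, k)
    read = kmer_set(reads, k)
    if not asm or not read:
        raise ValueError("need at least one k-mer in both assembly and reads")

    shared = asm & read      # assembly k-mers supported by the reads
    asm_only = asm - read    # assembly k-mers never seen in the reads

    # Assumed error model: a k-mer is intact only if all k bases are correct,
    # so the per-base accuracy is the k-th root of the supported fraction.
    p_base_correct = (len(shared) / len(asm)) ** (1.0 / k)
    error = 1.0 - p_base_correct
    qv = math.inf if error == 0 else -10.0 * math.log10(error)

    # Completeness: fraction of read k-mers recovered in the assembly.
    completeness = len(shared) / len(read)

    return {
        "asm_only": len(asm_only),
        "qv": qv,
        "completeness": completeness,
    }


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    result = evaluate(["ACGTACGTTG"], ["ACGTACGT", "GTACGTAG"], k=4)
    print(result)
```

In this model, k-mers found only in the assembly lower the quality value, while read k-mers missing from the assembly lower completeness. A real analysis would use canonical k-mers, a much larger k, and a dedicated k-mer counter in place of in-memory sets.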