Academic Article

TahcoRoll: fast genomic signature profiling via thinned automaton and rolling hash
Document Type
article
Source
Medical Review, Vol 1, Iss 2, Pp 114-125 (2021)
Subject
Aho–Corasick algorithm
genome sequencing
k-mers
multiple pattern matching
rolling hash
Medicine
Language
English
ISSN
2749-9642
79686664
Abstract
Genomic signatures such as k-mers have become one of the most prominent ways to describe genomic data. As a result, myriad real-world applications, such as the construction of de Bruijn graphs in genome assembly, have benefited from recognizing genomic signatures. An efficient approach to genomic signature profiling is therefore essential for processing high-throughput sequencing reads. However, most existing approaches recognize only fixed-size k-mers, although many studies have shown the importance of considering variable-length k-mers.