Academic Paper

SALIENT: Ultra-Fast FPGA-based Short Read Alignment
Document Type
Conference
Source
2022 International Conference on Field-Programmable Technology (ICFPT), pp. 1-10, Dec. 2022
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Sequential analysis
Precision medicine
Pipelines
Genomics
Throughput
Software
Hardware
Language
Abstract
State-of-the-art high-throughput DNA sequencers output terabytes of short reads that typically need to be aligned to a reference genome in order to perform downstream analyses. Because alignment typically dominates the total run time of bioinformatics pipelines, a number of recent works have sought to accelerate it in hardware. However, existing FPGA implementations did not fully optimize the alignment algorithms for the FPGA hardware and mainly focused on a subset of alignment problems, e.g., ungapped alignment with a limited number of mismatches, which hinders their practical utility. In this work, we analyze the existing alignment methods and identify and leverage opportunities for FPGA acceleration. Our alignment framework, SALIENT, first carries out an ultra-fast ungapped alignment, which supports a flexible number of mismatches. Based on the underlying bioinformatics pipeline and the information provided by the ungapped aligner, SALIENT then identifies a fraction of reads that need to go through its gapped aligner, thus improving alignment throughput. We extensively evaluate SALIENT using diverse datasets. Experimental results indicate that SALIENT, running on a single Xilinx Alveo U280 device, delivers an average throughput of 546 million bases/second, outperforming the state-of-the-art minimap2 software by 40x and Bowtie2 by up to 107x, with a similar or slightly better (~0.1%-0.5%) alignment rate and error (false negative/positive) rate. Compared to the existing ungapped FPGA aligners [1]–[4], SALIENT has 9.4-18x higher throughput/Watt, while compared to the gapped aligners [5], [6], it is 28-35x better. SALIENT achieves 7.6x higher throughput than the Illumina DRAGEN Bio-IT Platform [7].
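
The two-stage idea in the abstract (ungapped alignment with a mismatch budget first, gapped alignment only for the reads that need it) can be illustrated with a small software sketch. This is not SALIENT's FPGA design: the function names, the brute-force scan over the reference, and the triage rule "no ungapped hit within the budget means the read goes to the gapped aligner" are all assumptions made for illustration. The abstract says only that the decision is based on the underlying pipeline and on information from the ungapped aligner.

```python
def count_mismatches(read, ref, pos, max_mismatches):
    """Mismatch count for an ungapped placement of `read` at `pos`,
    or None if the placement exceeds the mismatch budget."""
    mismatches = 0
    for read_base, ref_base in zip(read, ref[pos:pos + len(read)]):
        if read_base != ref_base:
            mismatches += 1
            if mismatches > max_mismatches:
                return None  # early exit: budget exhausted
    return mismatches


def ungapped_align(read, ref, max_mismatches):
    """Best (position, mismatches) over all ungapped placements, or None.
    Brute-force scan; a hypothetical stand-in for an indexed search."""
    best = None
    for pos in range(len(ref) - len(read) + 1):
        mm = count_mismatches(read, ref, pos, max_mismatches)
        if mm is not None and (best is None or mm < best[1]):
            best = (pos, mm)
    return best


def triage(reads, ref, max_mismatches):
    """Split reads into those resolved by ungapped alignment and those
    that would be forwarded to a gapped aligner (assumed rule)."""
    aligned, needs_gapped = {}, []
    for name, read in reads.items():
        hit = ungapped_align(read, ref, max_mismatches)
        if hit is None:
            needs_gapped.append(name)
        else:
            aligned[name] = hit
    return aligned, needs_gapped
```

In this sketch, a read containing only substitutions within the budget is resolved in the first stage, while a read carrying an insertion or deletion usually fails every ungapped placement and is the only kind passed on to the more expensive gapped stage.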