Academic paper

Parallel Implementation of the Novel Approach to Genome Assembly
Document Type
Conference
Source
2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD '08), pp. 732-737, Aug 2008
Subject
Computing and Processing
Communication, Networking and Broadcast Technologies
Assembly
DNA
Genomics
Bioinformatics
Program processors
Microorganisms
Algorithm design and analysis
bioinformatics
DNA assembly
454 sequencing
parallel implementation
graphs
heuristics
Language
Abstract
The DNA assembly problem is well known for its high complexity at both the biological and computational levels. The traditional laboratory approach to the problem, which involves DNA sequencing by hybridization or by gel electrophoresis, introduces many errors at both the experimental and algorithmic stages. The DNA sequences that make up the traditional assembly input are a few hundred nucleotides long and cover each other rather sparsely. A recently proposed biochemical approach to DNA sequencing produces highly reliable output at relatively low cost and in a short time. This is 454 sequencing, based on the pyrosequencing protocol and owned by 454 Life Sciences Corporation. The sequences it produces are shorter (about 100-200 nucleotides), but their coverage of the assembled sequence is very dense. In this paper, we propose a parallel implementation of an algorithm that handles such data well and outperforms other assembly algorithms used in practice. The algorithm is a heuristic based on a graph model, with the graph built on the set of input sequences. Computational tests were performed on real data obtained from the 454 sequencer during sequencing of the genome of the bacterium Prochlorococcus marinus.
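The abstract does not spell out the graph model or the heuristic, so the following is only a generic illustration of the idea of assembling reads through a graph built on the input sequences. It is a minimal sketch of a textbook greedy overlap-graph assembler, not the authors' algorithm and not parallel; the function names (`overlap`, `build_overlap_graph`, `greedy_assemble`), the `min_len` threshold, and the toy reads are all assumptions made for the example. It also assumes error-free reads from a single strand.

```python
from itertools import permutations


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Directed graph as a dict: (i, j) -> overlap length between reads i and j."""
    graph = {}
    for i, j in permutations(range(len(reads)), 2):
        k = overlap(reads[i], reads[j], min_len)
        if k:
            graph[(i, j)] = k
    return graph


def greedy_assemble(reads, min_len=3):
    """Greedy heuristic: accept the heaviest edges first, keeping simple paths."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    graph = build_overlap_graph(reads, min_len)

    succ, pred = {}, {}
    parent = list(range(len(reads)))  # union-find, used to reject cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), k in sorted(graph.items(), key=lambda e: -e[1]):
        if a in succ or b in pred:
            continue  # each read gets at most one successor and one predecessor
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # this edge would close a cycle
        succ[a] = (b, k)
        pred[b] = a
        parent[ra] = rb

    # Walk every path from its start read and merge the reads into a contig.
    contigs = []
    for start in range(len(reads)):
        if start in pred:
            continue
        contig, node = reads[start], start
        while node in succ:
            node, k = succ[node]
            contig += reads[node][k:]
        contigs.append(contig)
    return contigs


if __name__ == "__main__":
    toy_reads = ["ATGCGT", "GCGTAC", "GTACGA"]  # hypothetical toy input
    print(greedy_assemble(toy_reads))
```

In a sketch like this, the pairwise overlap computations in `build_overlap_graph` are independent of one another, which is one place where work could in principle be divided among processors; whether the paper parallelizes this step is not stated in the abstract.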