Academic Paper

Poster: A Parallel Framework for Ab Initio Transcript-Clustering
Document Type
Conference
Author
Source
2018 IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion), pp. 331-332, May 2018
Subject
Computing and Processing
Clustering algorithms
Software
Software algorithms
Bioinformatics
Data structures
Pipelines
Filtering
Clustering
Expressed Sequence Tag (EST)
Object-Oriented Design Patterns
distributed caches
Minimum Spanning Tree (MST)
MPI
parallel framework
Language
ISSN
2574-1934
Abstract
Clustering is used to partition genomic data into disjoint subsets to streamline further processing. Since inputs can contain billions of nucleotides, performance is paramount. Consequently, clustering software is typically developed as a tightly coupled, monolithic system, which hinders software reusability, extensibility, and the introduction of new algorithms and data structures. Having experienced similar issues in our own clustering software, we have developed a flexible and extensible parallel framework called PEACE. The objective of the framework is to ease the design, implementation, and use of various clustering methods without compromising performance. This paper presents the PEACE framework, its software architecture, parallel infrastructure, and distributed data structures, along with a case study of developing a clustering algorithm. Case studies of developing filters, heuristics, and comparison algorithms are also discussed to illustrate the modularity and extensibility of PEACE, which enable software reuse in unique ways that may not have been foreseen when individual components were developed.