Scholarly Article

A Distributed-Processing System for Accelerating Biological Research Using Data-Staging

Document Type

Journal Article

Author

Hideo Matsuda; Shigeto Seno; Susumu Date; Yoichi Takenaka; Yoshiyuki Kido

Source

IPSJ Digital Courier. 2008, 4:250

Language

English

ISSN

1349-7456

Abstract

The number of biological databases has been increasing rapidly as a result of progress in biotechnology. As the amount and heterogeneity of biological data increase, it becomes more difficult to manage the data in a few centralized databases. Moreover, the number of sites storing these databases is growing, and their geographic distribution is becoming wider. In addition, biological research tends to require a large amount of computational resources, i.e., a large number of computing nodes, and this computational demand has been increasing with the rapid progress of biological research. Methods that enable computing nodes to use such widely distributed database sites effectively are therefore desired. In this paper, we propose a method for providing data from the database sites to computing nodes. Since it is difficult to decide in advance which program will run on a node and which data it will request as input, we have introduced the notion of “data-staging” in the proposed method. Data-staging dynamically searches the database sites for the input data and transfers the data to the node where the program runs. We have developed a prototype system with data-staging using grid middleware. The effectiveness of the prototype system is demonstrated by measuring the execution time of a similarity search of several hundred gene sequences against 527 prokaryotic genomes.
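
The data-staging idea described in the abstract (search the database sites for a program's input at run time, then copy it to the node that runs the program) can be illustrated with a minimal Python sketch. This is not the authors' prototype and does not use grid middleware: the in-memory SITES dictionary, the file names, and the functions find_site, stage, and run_on_node are all hypothetical stand-ins chosen for illustration only.

```python
from pathlib import Path
import tempfile

# Hypothetical stand-ins for remote database sites: site name -> {dataset: content}.
SITES = {
    "site_a": {"genome_001.fasta": ">g1\nATGC\n"},
    "site_b": {"genome_002.fasta": ">g2\nGGCC\n"},
}


def find_site(dataset, sites):
    """Return the name of the first site that holds the dataset, or None."""
    for name, catalog in sites.items():
        if dataset in catalog:
            return name
    return None


def stage(dataset, sites, node_dir):
    """Locate the dataset at run time and copy it into the node's local directory."""
    site = find_site(dataset, sites)
    if site is None:
        raise FileNotFoundError(f"{dataset} not found at any site")
    local_path = Path(node_dir) / dataset
    local_path.write_text(sites[site][dataset])  # simulated transfer
    return local_path


def run_on_node(inputs, sites):
    """Stage each input, then run a toy 'program' that counts FASTA records."""
    with tempfile.TemporaryDirectory() as node_dir:
        staged = [stage(name, sites, node_dir) for name in inputs]
        return sum(
            1
            for path in staged
            for line in path.read_text().splitlines()
            if line.startswith(">")
        )


if __name__ == "__main__":
    print(run_on_node(["genome_001.fasta", "genome_002.fasta"], SITES))
```

In this sketch the program never needs to know in advance which site holds its input; the lookup happens only when the program is about to run, which is the property the abstract attributes to data-staging. A real system would replace the dictionary lookup with a catalog query and the write with a network transfer.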
