Academic Paper
A Provenance Approach to Trace Scientific Experiments on a Grid Infrastructure
Document Type
Conference
Author
Source
2011 IEEE Seventh International Conference on eScience (e-Science), pp. 134-141, Dec. 2011
Subject
Language
Abstract
Large experiments on distributed infrastructures are becoming increasingly complex to manage; in particular, it is difficult to trace all the computations that gave rise to a piece of data or to an event such as an error. This paper describes the design and implementation of an architecture to support experiment provenance, and its deployment on a particular e-infrastructure for biosciences. The proposed solution consists of: (a) a data provenance repository that captures scientific experiments and their execution paths, (b) a software tool (crawler) that gathers, classifies, links, and stores the information collected from various sources, and (c) a set of user interfaces through which the end-user can access the provenance data, interpret the results, and trace the sources of failure. The approach is based on PLIER, an OPM-compliant API that is flexible enough to support future extensions and that facilitates interoperability among heterogeneous application systems.
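The tracing idea in the abstract can be illustrated with a minimal sketch of an Open Provenance Model (OPM) style graph. This is not the PLIER API, which the record does not describe: the class `ProvenanceGraph`, its methods, and the node names (`raw_scan`, `align_job`, and so on) are all hypothetical and chosen only for illustration. The only parts taken from OPM itself are the node kinds (artifact, process) and the edge names `used` and `wasGeneratedBy`, each of which points from an effect to its cause.

```python
# Illustrative sketch only: a tiny in-memory OPM-style graph.
# All class, method, and node names are assumptions, not the PLIER API.

USED = "used"                        # OPM edge: process -> artifact it consumed
WAS_GENERATED_BY = "wasGeneratedBy"  # OPM edge: artifact -> process that made it


class ProvenanceGraph:
    def __init__(self):
        self.kinds = {}   # node id -> "artifact" or "process"
        self.causes = {}  # effect id -> list of (relation, cause id)

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind

    def add_edge(self, effect, relation, cause):
        # OPM edges point from the effect back to its cause.
        self.causes.setdefault(effect, []).append((relation, cause))

    def trace(self, node_id):
        """Return every node that causally precedes node_id (depth-first)."""
        seen, order, stack = set(), [], [node_id]
        while stack:
            current = stack.pop()
            for _relation, cause in self.causes.get(current, []):
                if cause not in seen:
                    seen.add(cause)
                    order.append(cause)
                    stack.append(cause)
        return order


# Hypothetical two-step experiment that ends in an error log.
graph = ProvenanceGraph()
for name, kind in [
    ("raw_scan", "artifact"),
    ("align_job", "process"),
    ("aligned_scan", "artifact"),
    ("stats_job", "process"),
    ("error_log", "artifact"),
]:
    graph.add_node(name, kind)

graph.add_edge("align_job", USED, "raw_scan")
graph.add_edge("aligned_scan", WAS_GENERATED_BY, "align_job")
graph.add_edge("stats_job", USED, "aligned_scan")
graph.add_edge("error_log", WAS_GENERATED_BY, "stats_job")

# Walk back from the error to everything that led to it.
lineage = graph.trace("error_log")
print(lineage)
```

Under these assumptions, following the edges backwards from `error_log` yields the chain of processes and artifacts that preceded the failure, which is the kind of query the user interfaces in (c) would run against the repository in (a).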