Academic Paper

Optimization of system architecture for Big Data analysis in climate science
Document Type
Conference
Source
2015 IEEE International Conference on Big Data (Big Data), pp. 2169-2172, Oct. 2015
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Data models
Computational modeling
Meteorology
Analytical models
Uncertainty
Data processing
Measurement
Abstract
In this paper, we describe an emergent tool called DAWN (short for "Distributed Analytics, Workflows and Numeric") which is a model for simulating, analyzing and optimizing system architectures for executing arbitrary data processing pipelines. As an example, we will apply DAWN to the investigation of a real-life Big Data use case in climate science: the evaluation of simulated rainfall characteristics using high-resolution observational data. We will show how DAWN can help in determining the optimal architecture, and science algorithms, to execute this case study analyzing distributed datasets, as a tradeoff between the overall time cost and the uncertainty of calculated metrics for model evaluation. We will also show how DAWN can guide architectural decisions for future research, specifically impacting how data should be generated and analyzed to cope with future projected data volumes.