Academic Paper

The Case of a Novel Operational Distributed Storage Service for Big Data in a Semiconductor Wafer Fabrication Foundry
Document Type
Conference
Source
2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), pp. 1028-1033, Dec. 2018
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Servers
Distributed databases
Conferences
Publishing
Big Data
Cluster computing
Protocols
Hadoop
Big data storage
Distributed file systems
Web service computing
Middleware
Abstract
This paper presents a novel infrastructural service based on Hadoop for big data storage and computing in a Taiwanese semiconductor wafer fabrication foundry. The service, named Hadoop Data Service (HDS), has been built and operated in production systems for 3.5 years and has evolved over time by incrementally accommodating users' requirements. HDS is a web-based distributed big data storage facility: users access data objects stored in Hadoop through HDS over HTTP. HDS is scalable and reliable. It is also efficient and effective, intelligently selecting either the Hadoop Distributed File System (HDFS) or the HBase database for publishing data objects. In addition, HDS is transparent to existing analytics and data inquiry applications, such as Spark and Hive. The paper discusses the design and implementation features of HDS and reports its performance metrics.