Academic Paper

Image processing on multinode hadoop cluster
Document Type
Conference
Source
2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT), pp. 21-26, Dec. 2017
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Engineering Profession
Fields, Waves and Electromagnetics
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Image processing
Distributed databases
Big Data
Ecosystems
Computer science
Tools
Internet
Hadoop
Hadoop Distributed File System (HDFS)
Hadoop Image Processing Interface (HIPI)
MapReduce
Abstract
Over the past few years, the volume of data produced across the Internet has grown at an exponential rate. Storage costs rose sharply as a result, but the introduction of new technologies in computer science has since reduced them, which in turn led to a sharp rise in data generation rates. Data so vast and complex that classical methods are insufficient to process it is termed 'Big Data'. The Hadoop ecosystem offers various tools, such as Pig and HBase, for analyzing textual data. However, much of the data on the Internet and on social networking sites consists of unstructured media, and images account for the largest share of these media files. The major concern is not the storage of images but the processing of images that are being generated at a very high rate. Around 350 million pictures are uploaded to social networks every day. To date, over 200 billion photos have been uploaded to Facebook alone, which amounts to an average of around 200 photos per user. All of the data generated around the globe can be classified into three formats: structured, semi-structured, and unstructured. Image data contains not only the pictures themselves but also the data describing them, such as the resolution, the source of the image, and the capture device. The Hadoop Image Processing Interface (HIPI) tools are used to fetch all this information in a structured format. In this paper, image processing is performed on a multinode Hadoop cluster and its performance is measured.
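The idea of turning image metadata into structured records and aggregating them with MapReduce can be illustrated with a minimal sketch. This is plain Python rather than Hadoop or HIPI code, and it is not the authors' implementation: the sample records, the field names, and the functions map_record, reduce_by_key, and summarize are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical metadata records, standing in for what an image-header
# reader might emit. None of these values come from the paper.
RECORDS = [
    {"source": "a.jpg", "width": 1920, "height": 1080, "device": "Camera A"},
    {"source": "b.jpg", "width": 1280, "height": 720, "device": "Camera B"},
    {"source": "c.jpg", "width": 640, "height": 480, "device": "Camera A"},
]


def map_record(record):
    """Map step: emit (capture device, (image count, pixel count))."""
    pixels = record["width"] * record["height"]
    return record["device"], (1, pixels)


def reduce_by_key(pairs):
    """Reduce step: sum image counts and pixel counts per capture device."""
    totals = defaultdict(lambda: [0, 0])
    for key, (count, pixels) in pairs:
        totals[key][0] += count
        totals[key][1] += pixels
    return {key: tuple(value) for key, value in totals.items()}


def summarize(records):
    """Run the map step over every record, then reduce the emitted pairs."""
    return reduce_by_key(map_record(record) for record in records)


if __name__ == "__main__":
    for device, (count, pixels) in sorted(summarize(RECORDS).items()):
        print(f"{device}: {count} image(s), {pixels} pixels in total")
```

On a real cluster the map step would run in parallel on the nodes that hold the image blocks, and the framework would group the emitted pairs by key before the reduce step; the sketch does both in a single process.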