Academic Paper

Amdahl's Law in Big Data Analytics: Alive and Kicking in TPCx-BB (BigBench)
Document Type
Conference
Source
2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 630-642, Feb. 2018
Subject
Computing and Processing
Benchmark testing
Big Data
Industries
Data analysis
Throughput
Usability
Parallel processing
big data analytics
Hadoop
BigBench
TPCx-BB
Amdahl's law
core packing
ISSN
2378-203X
Abstract
Big data, specifically data analytics, is responsible for driving many of consumers' most common online activities, including shopping, web searches, and interactions on social media. In this paper, we present the first (micro)architectural investigation of a new industry-standard, open source benchmark suite directed at big data analytics applications: TPCx-BB (BigBench). Where previous work has usually studied benchmarks that oversimplify big data analytics, our study of BigBench reveals that there is immense diversity among applications, owing to their varied data types, computational paradigms, and analyses. In our analysis, we also make an important discovery about a factor that generally restricts processor performance in big data workloads. Contrary to the conventional wisdom that big data applications lend themselves naturally to parallelism, we discover that they lack sufficient thread-level parallelism (TLP) to fully utilize all cores. In other words, they are constrained by Amdahl's law. While TLP may be limited by various factors, ultimately we find that single-thread performance is as relevant in scale-out workloads as it is in more classical applications. To this end, we present core packing: a software and hardware solution that could provide as much as 20% execution speedup for some big data analytics applications.
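The abstract's central claim rests on Amdahl's law, which bounds the speedup from adding cores by the fraction of work that stays serial. The following is a minimal sketch of the textbook formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n is the core count. The function name `amdahl_speedup` and the sample values of p and n are assumptions chosen for illustration; they are not measurements or code from the paper.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative values only (assumed); not taken from the paper.
    for p in (0.50, 0.90, 0.99):
        for n in (4, 16, 64):
            s = amdahl_speedup(p, n)
            print(f"p={p:.2f} cores={n:3d} speedup={s:6.2f}")
```

With a fixed serial fraction, the speedup can never exceed 1 / (1 - p) however many cores are added, which is why single-thread performance remains relevant when thread-level parallelism is limited.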