Academic Paper

Interactive Analytic DBMSs: Breaching the Scalability Wall
Document Type
Conference
Source
2021 IEEE 37th International Conference on Data Engineering (ICDE), pp. 432-443, Apr. 2021
Subject
Computing and Processing
Measurement
Fault tolerance
Social networking (online)
Scalability
Fault tolerant systems
Production
Load management
Interactive DBMS
analytics
sharding
Language
English
ISSN
2375-026X
Abstract
Analytic DBMSs optimized for query interactivity commonly push the computation down to storage nodes, thus avoiding large network transfers and keeping query execution wall-time to a minimum. In these systems, data is sharded and stored locally by cluster nodes, which must all participate in query execution. As the system scales out, hardware failures and other non-deterministic sources of tail latency start to dominate, to a point where query latency and success ratio increasingly violate the system's SLA. We refer to this tipping point as the system's scalability wall, beyond which sharding data across more nodes only worsens the problem.

This paper describes how an analytic DBMS optimized for low-latency queries can breach the scalability wall by sharding different tables to different subsets of cluster nodes — a strategy we call partial sharding — thereby reducing query fan-out. Because partial sharding requires the DBMS to implement many tedious and complex shard management tasks, such as shard mapping, load balancing and fault tolerance, this paper describes how a database system can leverage an external general-purpose shard management service for such tasks. We present a case study based on Cubrick, an in-memory analytic DBMS developed at Facebook, highlighting the integration points with a shard management framework called Shard Manager. Finally, we describe the many design decisions, pitfalls and lessons learned during this process, which eventually allowed Cubrick to scale to thousands of nodes.
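The core idea of partial sharding — mapping each table's shards to a small, stable subset of cluster nodes so that query fan-out is bounded by the subset size rather than the cluster size — can be illustrated with a minimal sketch. This is not the paper's or Shard Manager's actual placement algorithm; the function name, the use of rendezvous-style hashing to pick a deterministic node subset, and the round-robin shard assignment are all illustrative assumptions.

```python
import hashlib

def place_table_shards(table, num_shards, cluster, subset_size):
    """Hypothetical partial-sharding scheme: assign a table's shards
    to a deterministic subset of cluster nodes."""
    # Rank nodes by a hash of (table, node) and keep the top subset_size.
    # This rendezvous-style hashing yields a stable per-table subset
    # without any central coordination state.
    ranked = sorted(
        cluster,
        key=lambda node: hashlib.sha256(f"{table}:{node}".encode()).hexdigest(),
    )
    subset = ranked[:subset_size]
    # Spread the table's shards round-robin over its node subset.
    return {shard: subset[shard % subset_size] for shard in range(num_shards)}

cluster = [f"node{i}" for i in range(100)]
placement = place_table_shards("clicks", num_shards=32,
                               cluster=cluster, subset_size=8)
# A query over this table fans out to at most subset_size nodes,
# not to all 100 — the property partial sharding relies on.
fanout = len(set(placement.values()))
assert fanout <= 8
```

Under full sharding the same query would touch all 100 nodes, so every node's tail latency and failure probability contributes to the query; bounding fan-out at 8 is what pushes back the scalability wall.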