학술논문

Compressed Data Direct Computing for Databases

Document Type

Periodical

Author

Wan, W.; Zhang, F.; Zhang, C.; Zhang, M.; Zhai, J.; Chai, Y.; Zhang, H.; Lu, W.; Chen, Y.; Li, H.; Pan, A.; Du, X.

Source

IEEE Transactions on Knowledge and Data Engineering IEEE Trans. Knowl. Data Eng. Knowledge and Data Engineering, IEEE Transactions on. 36(5):1902-1918 May, 2024

Subject

Computing and Processing
Databases
Database systems
File systems
Data analysis
Memory management
Engines
Big Data applications
Compression
compressed data direct processing
database systems

Language

ISSN

1041-4347
1558-2191
2326-3865

Abstract

Directly performing operations on compressed data has been proven to be a big success facing Big Data problems in modern data management systems. These systems have demonstrated significant compression benefits and performance improvement for data analytics applications. However, current systems only focus on data queries, while a complete Big Data system must support both data query and data manipulation. To solve this problem, we develop CompressDB, which is a new storage engine that can support data processing for databases without decompression. CompressDB has the following advantages. First, CompressDB utilizes context-free grammar to compress data, and supports both data query and data manipulation. Second, for adaptability, we integrate CompressDB to file systems so that a wide range of databases can directly use CompressDB without any change. Third, we enable operation pushdown to storage so that we can perform data query and manipulation in storage systems without bringing large data to memory for high efficiency. We validate the efficacy of CompressDB supporting various kinds of database systems, including SQLite, MySQL, LevelDB, MongoDB, ClickHouse, and Neo4j. We evaluate our method using seven real-world datasets with various lengths, structures, and content in both single node and cluster environments. Experiments show that CompressDB achieves 40% throughput improvement and 44% latency reduction, along with 1.75 compression ratio on average.

Online Access

Full Text (IEEE) Web of Science JCR 저널정보 Scopus Find it@PNU

이메일

부산대학교 도서관

Online Access

메일 발송