e-Article

ObjDedup: High-Throughput Object Storage Layer for Backup Systems With Block-Level Deduplication

Document Type

Periodical

Author

Jackowski, A.; Slusarczyk, L.; Lichota, K.; Welnicki, M.; Wijata, R.; Kielar, M.; Kopec, T.; Dubnicki, C.; Iwanicki, K.

Source

IEEE Transactions on Parallel and Distributed Systems (IEEE Trans. Parallel Distrib. Syst.), 34(7):2180-2197, Jul. 2023

Subject

Computing and Processing
Communication, Networking and Broadcast Technologies
Metadata
Engines
Throughput
Object recognition
Cloud computing
Aerospace electronics
Quality of service
Backup storage
deduplication
object storage
secondary storage

Language

ISSN

1045-9219
1558-2183
2161-9883

Abstract

The immense popularity of object storage is also affecting the market of backup. Not only have novel backup solutions emerged that utilize cloud-based object storage as backends, but also support for object storage interfaces is increasingly expected from traditional dedicated backup appliances. This latter trend especially concerns systems with data deduplication, as they can offer compelling gains in storage capacity and throughput. However, such systems have been designed for interfaces and workloads that are markedly different from those encountered in object storage. Notably, they expect data to be written in portions that are orders of magnitude longer than those in the novel object-storage-oriented backup applications. In this light, our contribution is twofold. First, contrasting the properties of object storage interfaces with usage patterns from 686 commercial deployments of backup appliances, we identify specific issues an implementation of such an interface has to address to offer adequate performance in a backup system with block-level deduplication. In particular, we show that a major challenge is efficient metadata management. Second, we present distributed data structures and algorithms to handle object metadata in backup systems with block-level deduplication. Subsequently, we implement them as an object storage layer for our HYDRAstor backup system. In comparison to object storage without in-line deduplication, our solution achieves 1.8–3.93x higher write throughput. Compared to object storage on top of a state-of-the-art file-based backup system, it processes 5.26–11.34x more object put operations per time unit.
