Academic Paper

Derecho: Fast State Machine Replication for Cloud Services
Document Type
Academic Journal
Source
ACM Transactions on Computer Systems (TOCS). 36(2):1-49
Subject
Cloud computing
RDMA
consistency
non-volatile memory
replication
Language
English
ISSN
0734-2071
1557-7333
Abstract
Cloud computing services often replicate data and may require ways to coordinate distributed actions. Here we present Derecho, a library for such tasks. The API provides interfaces for structuring applications into patterns of subgroups and shards, supports state machine replication within them, and includes mechanisms that assist in restart after failures. Running over 100 Gbps RDMA, Derecho can send millions of events per second in each subgroup or shard, and throughput peaks at 16 GB/s, substantially outperforming prior solutions. Configured to run purely on TCP, Derecho is still substantially faster than comparable widely used, highly tuned, standard tools. The key insight is that on modern hardware (including non-RDMA networks), data-intensive protocols should be built from non-blocking data-flow components.