Minimizing Network and Storage Costs for Consensus with Flexible Erasure Coding

PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023 (2023)

Abstract
Consensus protocols like Paxos and Raft provide data consistency and fault tolerance for upper-layer distributed services. Log replication in these protocols can be supported by erasure coding, which incurs a lower redundancy ratio than full-copy replication and hence significantly reduces network and storage costs, improving overall performance. However, existing consensus protocols with erasure coding cannot achieve the minimum network and storage costs during log replication. Our observation is that the optimal coding scheme varies with the number of healthy servers in a group: the coding scheme with the lowest redundancy ratio in the normal case incurs more network traffic and storage overhead for log replication when servers fail. To this end, we propose FlexRaft, which dynamically adjusts the coding scheme used in Raft based on server status so that it always achieves the theoretically minimum redundancy ratio, while maintaining the same liveness as the original Raft. To address inconsistency between the coding scheme used by the leader and that used by its followers, we specify the prerequisite for overwriting a log entry and allow the leader and its followers to track exactly which coding scheme is in use. We further consider how to handle server failures and prove the safety of FlexRaft. We implement a prototype of FlexRaft, atop which we build a distributed key-value store to show its efficacy. Experiments on Alibaba Cloud show that FlexRaft achieves the theoretically minimum network and storage costs in practice, and reduces the commit latency by 44.51% and 19.37% compared with the state-of-the-art CRaft and HRaft, respectively.
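The redundancy-ratio comparison in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a log entry is split into k equal data fragments and that each of n servers stores exactly one fragment (data or parity), so the total stored and transmitted volume is n/k times the entry size. The function name `redundancy_ratio` and the 5-server numbers are hypothetical and are not taken from the paper; in particular, the sketch does not reproduce the rule FlexRaft uses to choose k.

```python
from fractions import Fraction


def redundancy_ratio(k: int, n: int) -> Fraction:
    """Total volume stored across n servers, relative to the entry size.

    Assumes the entry is split into k equal data fragments and each of
    the n servers holds exactly one fragment (data or parity).
    k = 1 corresponds to full-copy replication.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return Fraction(n, k)


# Hypothetical 5-server group (numbers chosen for illustration only).
full_copy = redundancy_ratio(1, 5)  # every server stores the whole entry
coded = redundancy_ratio(3, 5)      # 3 data fragments + 2 parity fragments

print(f"full-copy replication: {float(full_copy):.2f}x")
print(f"(k=3, n=5) coding:     {float(coded):.2f}x")
```

Under these assumptions a larger k gives a lower ratio, which is why the best choice of k depends on how many servers are available to receive fragments.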
Keywords
Raft, Erasure coding, Key-value store