Xbft: Byzantine fault tolerance with high performance, low cost, and aggressive fault isolation (2008)

Abstract
We are increasingly relying on online services to store, access, share, and disseminate critical information from anywhere and at all times. Such services include email, digital storage, photos, video, health and financial services, etc. With increasing evidence of non-fail-stop failures in practical systems, Byzantine fault tolerant state machine replication technique is becoming increasingly attractive for building highly-reliable services in order to tolerate such failures. However, existing Byzantine fault tolerant techniques fall short of providing high availability, high performance, and long-term data durability guarantees with competitive replication cost.

In this dissertation, we present BFT replication techniques that facilitate the design and implementation of such highly-reliable services by providing high availability, high performance and high durability with competitive replication cost (hardware, software, network, management).

First, we propose CBASE, a BFT state machine replication architecture that leverages application-level parallelism to improve throughput of the replicated system by identifying and executing independent requests concurrently. Traditional state machine replication based Byzantine fault tolerant (BFT) techniques provide high availability and security but fail to provide high throughput. This limitation stems from the fundamental assumption of generalized state machine replication techniques that all replicas execute requests sequentially in the same total order to ensure consistency across replicas. Our architecture thus provides a general way to exploit application parallelism in order to provide high throughput without compromising correctness.

Second, we present Zyzzyva, an efficient BFT agreement protocol that uses speculation to significantly reduce the performance overhead and replication cost of BFT state machine replication. In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima.

Third, we design and implement SafeStore, a distributed storage system designed to maintain long-term data durability despite conventional hardware and software faults, environmental disruptions, and administrative failures caused by human error or malice. The architecture of SafeStore is based on fault isolation, which SafeStore applies aggressively along administrative, physical, and temporal dimensions by spreading data across autonomous storage service providers (SSPs). SafeStore also performs an efficient end-to-end audit of SSPs to detect data loss quickly and improve data durability by reducing MTTR. SafeStore offers durable storage with cost, performance, and availability competitive with traditional storage systems.

We evaluate these techniques by implementing BFT replication libraries and further demonstrate the practicality of these approaches by implementing an NFS based replicated file system (CBASE-FS) and a durable storage system (SafeStore-FS).
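The CBASE idea of executing independent requests concurrently while preserving the agreed total order for dependent ones can be illustrated with a small sketch. This is not the dissertation's implementation: the `Request` type, the read/write key sets used to decide independence, and the `conflicts` and `schedule` helpers are all assumptions made for illustration. The abstract does not say how CBASE identifies independent requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request: position in the total order plus the keys it touches."""
    seq: int
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a, b):
    """Assumed rule: two requests depend on each other if one writes
    a key that the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def schedule(ordered):
    """Group totally ordered requests into batches.

    Requests within a batch are pairwise independent and could run
    concurrently; batches run one after another. Each request lands in
    the batch right after the last batch containing a conflicting
    earlier request, so dependent requests keep their agreed order.
    The result depends only on the input order, so every replica that
    sees the same total order derives the same schedule.
    """
    batches = []
    for req in ordered:
        level = 0
        for i, batch in enumerate(batches):
            if any(conflicts(req, other) for other in batch):
                level = i + 1
        if level == len(batches):
            batches.append([])
        batches[level].append(req)
    return batches


ordered = [
    Request(1, writes=frozenset({"a"})),
    Request(2, reads=frozenset({"b"})),
    Request(3, reads=frozenset({"a"})),
    Request(4, writes=frozenset({"b"})),
]
plan = [[r.seq for r in batch] for batch in schedule(ordered)]
print(plan)  # [[1, 2], [3, 4]]
```

In this example, requests 1 and 2 touch different keys and share a batch, while requests 3 and 4 each depend on an earlier request and are deferred to the second batch.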
Keywords
BFT state machine replication,total order,high throughput,BFT replication technique,byzantine fault tolerance,BFT replication library,low cost,generalized state machine replication,high performance,machine replication technique,aggressive fault isolation,competitive replication cost,high availability