On the Implementation of Unreliable Failure Detectors in Partially Synchronous Systems

IEEE Transactions on Computers (2004)

Abstract

Unreliable failure detectors were proposed by Chandra and Toueg as mechanisms that provide information about process failures. Chandra and Toueg defined eight classes of failure detectors, depending on how accurate this information is, and presented an algorithm implementing a failure detector of one of these classes in a partially synchronous system. This algorithm is based on all-to-all communication and periodically exchanges a number of messages that is quadratic in the number of processes. In this paper, we study the implementability of different classes of failure detectors in several models of partial synchrony. We first show that no failure detector with perpetual accuracy (namely, $\mathcal{P}$, $\mathcal{Q}$, $\mathcal{S}$, and $\mathcal{W}$) can be implemented in these models in systems with even a single failure. We also show that, in these models of partial synchrony, a majority of correct processes is necessary to implement a failure detector of the class $\Theta$ proposed by Aguilera et al. Then, we present a family of distributed algorithms that implement the four classes of unreliable failure detectors with eventual accuracy (namely, $\diamond\mathcal{P}$, $\diamond\mathcal{Q}$, $\diamond\mathcal{S}$, and $\diamond\mathcal{W}$). Our algorithms are based on a logical ring arrangement of the processes, which defines the monitoring and failure information propagation pattern. The resulting algorithms periodically exchange at most a linear number of messages.
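
To make the quadratic-versus-linear contrast concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes, purely for illustration, that each non-suspected process monitors the next non-suspected process clockwise on the ring and that each monitoring link costs one message per period. The function names `all_to_all_messages`, `ring_targets`, and `ring_messages` are hypothetical.

```python
from __future__ import annotations


def all_to_all_messages(n: int) -> int:
    """Messages per period if every process sends to every other process."""
    return n * (n - 1)


def ring_targets(n: int, suspected: set[int]) -> dict[int, int]:
    """Map each non-suspected process to the process it monitors.

    Assumption (illustrative only): processes 0..n-1 form a logical ring,
    and each one monitors the next non-suspected process clockwise.
    """
    targets: dict[int, int] = {}
    for p in range(n):
        if p in suspected:
            continue
        q = (p + 1) % n
        # Skip over suspected processes until we find a candidate or wrap around.
        while q in suspected and q != p:
            q = (q + 1) % n
        if q != p:  # a lone remaining process has nobody to monitor
            targets[p] = q
    return targets


def ring_messages(n: int, suspected: set[int]) -> int:
    """Messages per period, assuming one message per monitoring link."""
    return len(ring_targets(n, suspected))


if __name__ == "__main__":
    n = 5
    print(all_to_all_messages(n))   # quadratic pattern
    print(ring_messages(n, set()))  # ring pattern, nobody suspected
    print(ring_targets(n, {2}))     # process 1 skips suspected process 2
```

Under these assumptions the ring needs at most n messages per period, while the all-to-all pattern needs n(n-1). The timeout handling that yields eventual accuracy in a partially synchronous system is left out of this sketch.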

Keywords

process failure, failure detector, single failure, $\mathcal{P}$, $\mathcal{Q}$, distributed "oracle" that gives possibly incorrect hints about which processes of the system have crashed. based on two basic abstract properties namely, unreliable failure detector, partially synchronous systems, partial synchrony, failure information propagation pattern, $\mathcal{W}$, unreliable failure detectors, completeness and accuracy, linear number, Chandra and Toueg proposed eight different classes of unreliable failure detectors and showed that consensus could be solved in an asynchronous system with any of them. Chandra-Toueg's model of unreliable failure detectors can be viewed as an abstract way of incorporating partial synchrony assumptions into the model of computat, distributed computing, asynchronous system, algorithm design and analysis, detectors, computational complexity, consensus problem, distributed algorithm, helium, distributed algorithms, computational modeling, distributed systems
