Online Fingerpointing: Just-in-Time Problem Diagnosis for Distributed Systems

msra

Abstract
Distributed systems are growing both in size and complexity. In the event of a system failure, this makes it increasingly difficult for systems administrators to determine which component failed. Existing tools and algorithms have been designed to diagnose problems, but they rely on offline analysis. This work explores the possibility of online failure diagnosis that operates as the distributed system under observation is running. A framework for online fingerpointing is presented and evaluated.