Vicious Cycles in Distributed Software Systems

2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE(2023)

引用 0|浏览4
暂无评分
摘要
A major threat to distributed software systems' reliability is vicious cycles, which are observed when an event in the distributed software system's execution causes a system degradation, and the degradation, in turn, causes more of such events. Vicious cycles often result in large-scale cloud outages that are hard to recover from due to their self-reinforcing nature. This paper formally defines Vicious Cycle, and conducts the first in-depth study of 33 real-world vicious cycles in 13 widely-used open-source distributed software systems, shedding light on the root causes, triggering conditions, and fixing strategies of vicious cycles, with over a dozen concrete implications to combat them. Our findings show that the majority of the vicious cycles are caused by incorrect error handlers, where the handlers do not obtain enough information to distinguish between 1) an error induced by incoming requests and 2) an error induced by an unexpected interference from another error handler. This paper further performs a feasibility study by 1) building a monitoring tool that prevents one type of vicious cycle by collecting information to make a more informed decision in error handling, and 2) investigating the effectiveness of one commonly suggested practice-injecting exponential backoff-to prevent vicious cycles induced by unconstrained retry.
更多
查看译文
关键词
distributed software systems,vicious cycles
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要