ADELE: Anomaly Detection from Event Log Empiricism

IEEE INFOCOM (2018)

Abstract
A sudden slowdown or shutdown of an enterprise application affects a large population of users. System administrators and analysts spend a considerable amount of time dealing with functional and performance bugs. These problems are particularly hard to detect and diagnose in most computer systems, since there is a huge amount of system-generated supportability data (counters, logs, etc.) that needs to be analyzed. Most often, there is no clear or obvious root cause. Timely identification of a significant change in application behavior is very important to prevent negative impact on the service. In this paper, we present ADELE, an empirical, data-driven methodology for early detection of anomalies in data storage systems. The key feature of our solution is diligent selection of features from system logs and development of effective machine learning techniques for anomaly prediction. ADELE learns from a system's own history to establish the baseline of normal behavior and gives accurate indications of the time period when something is amiss for a system. Validation on more than 4800 actual support cases shows ~83% true positive rate and ~12% false positive rate in identifying periods when the machine is not performing normally. We also establish the existence of problem “signatures” which help map customer problems to already seen issues in the field. ADELE's capability to predict early paves the way for online failure prediction for customer systems.
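The idea of learning a baseline from a system's own history can be illustrated with a minimal sketch. This is not ADELE's actual method, which the abstract does not specify. It assumes hypothetical per-interval counts of log event types as features and swaps in a plain z-score deviation from the historical mean; the feature names, the `threshold` value, and all function names are assumptions made for illustration only.

```python
from statistics import mean, stdev

def fit_baseline(history):
    """Learn (mean, std) per feature from the system's own past counts.

    history: dict mapping a hypothetical log-feature name to a list of
    per-interval event counts observed during normal operation.
    """
    baseline = {}
    for feature, counts in history.items():
        mu = mean(counts)
        # Sample standard deviation needs at least two points.
        sigma = stdev(counts) if len(counts) > 1 else 0.0
        baseline[feature] = (mu, sigma)
    return baseline

def anomaly_score(baseline, observation, eps=1e-9):
    """Average absolute z-score of an observation across baseline features."""
    scores = []
    for feature, (mu, sigma) in baseline.items():
        value = observation.get(feature, 0)
        scores.append(abs(value - mu) / (sigma + eps))
    return sum(scores) / len(scores) if scores else 0.0

def is_anomalous(baseline, observation, threshold=3.0):
    """Flag an interval whose score exceeds an assumed threshold."""
    return anomaly_score(baseline, observation) > threshold

# Illustrative (made-up) history for two hypothetical log features.
history = {
    "disk_error": [2, 3, 2, 4, 3, 2, 3],
    "timeout": [10, 12, 11, 9, 10, 11, 12],
}
baseline = fit_baseline(history)
normal = {"disk_error": 3, "timeout": 11}
spike = {"disk_error": 40, "timeout": 60}
```

With this toy data, `is_anomalous(baseline, normal)` is false and `is_anomalous(baseline, spike)` is true. A real detector would need carefully selected features and a threshold tuned against labeled support cases to trade off true and false positive rates.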
Keywords
System Log, Anomaly Detection