Fingerpointing for Hadoop (CMU-PDL-08-104)

Semantic Scholar (2015)

Abstract
Localizing performance problems (or fingerpointing) is essential for distributed systems such as Hadoop that support long-running, parallelized, data-intensive computations over a large cluster of nodes. Manual fingerpointing does not scale in such environments because of the number of nodes and the number of performance metrics to be analyzed on each node. ASDF (Automated System for Diagnosing Failures) is an automated, online fingerpointing framework that transparently extracts and parses different time-varying data sources (e.g., sysstat, Hadoop logs) on each node, and implements multiple techniques (e.g., log analysis, correlation, clustering) to analyze these data sources jointly or in isolation. We demonstrate ASDF’s online fingerpointing for documented performance problems in Hadoop, under different workloads; our results indicate that ASDF incurs an average monitoring overhead of 0.38% of CPU time, and exhibits average online fingerpointing latencies of less than 1 minute with false-positive rates of less than 1%.

Acknowledgements: The authors would like to acknowledge Julio Lopez for discussions on Hadoop, Kathleen Carley for her insights on visualization, and Christos Faloutsos for discussions on data mining. This material is based on research sponsored in part by the National Science Foundation, via CAREER grant CCR-0238381 and grant CNS-0326453.
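To make the idea of fingerpointing more concrete, the sketch below flags nodes whose value for a single metric deviates strongly from that of their peers. It is only an illustration of peer comparison across nodes and is not ASDF's actual analysis, which the abstract does not specify in detail. The function name `fingerpoint`, the threshold, the use of the median absolute deviation, and the sample CPU figures are all assumptions made up for this example.

```python
from statistics import median


def fingerpoint(samples, threshold=3.0):
    """Return the nodes whose metric is an outlier relative to their peers.

    `samples` maps a node name to one metric reading (for example, a CPU
    utilisation percentage of the kind sysstat reports). A node is flagged
    when its distance from the cluster median exceeds `threshold` times the
    median absolute deviation (MAD). This is an illustrative stand-in, not
    the method used by ASDF.
    """
    values = list(samples.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Guard against division by zero when most nodes report the same value.
    scale = max(mad, 1e-9)
    return sorted(
        node for node, v in samples.items() if abs(v - med) / scale > threshold
    )


if __name__ == "__main__":
    # Hypothetical per-node CPU utilisation readings, invented for illustration.
    cpu_by_node = {
        "node1": 12.0,
        "node2": 11.5,
        "node3": 12.4,
        "node4": 95.0,
        "node5": 11.9,
    }
    print(fingerpoint(cpu_by_node))
```

A median-based score is used here because a single misbehaving node barely shifts the median, whereas it can inflate a mean and standard deviation enough to hide itself. An online framework would apply a check of this kind repeatedly over a sliding window of recent samples rather than to a single reading.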