Breaking The Boundary For Whole-System Performance Optimization Of Big Data

International Symposium on Low Power Electronics and Design(2013)

引用 11|浏览30
暂无评分
摘要
MapReduce plays an critical role in finding insights in Big Data. The performance optimization of MapReduce programs is challenging because it requires a comprehensive understanding of the whole system including both hardware layers (processors, storages, networks and etc), and software stacks (operating systems, JVM, runtime, applications and etc). However, most of the existing performance tuning and optimization are based on empirical and heuristic attempts. It remains a blank on how to build a systematical framework which breaks the boundary of multiple layers for performance optimization.In this paper, we propose a performance evaluation framework by correlating performance metrics from different layers, which provides insights to efficiently pinpoint the performance issue. This framework is composed of a series of predefined patterns. Each pattern indicates one or more potential issues. The behavior of a MapReduce program is mapped to the corresponding resource utilization. The framework provides a holistic approach which allows users at different levels of experience to conduct MapReduce program performance optimization.We use Terasort benchmark running on a 10-node Power7R2 cluster as a real case to show how this framework improves the performance. By this framework, we finally get the Terasort result improved from 47 mins to less than 8 mins. In addition to the best practice on performance tuning, several key findings are summarized as valuable workload analysis for JVM, MapReduce runtime and application design.
更多
查看译文
关键词
data handling,optimisation,parallel programming,performance evaluation,resource allocation,10-node Power7R2 cluster,JVM,MapReduce program performance optimization,Terasort benchmark,big data,hardware layers,performance evaluation framework,performance tuning,resource utilization,software stacks,whole-system performance optimization,
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要