Multi-objective Hadoop Configuration Optimization Using Steady-State NSGA-II.

Joint International Conference on Soft Computing and Intelligent Systems (SCIS) and International Symposium on Advanced Intelligent Systems (ISIS), 2016

Abstract
Hadoop configuration optimization is very challenging because of the complexity of the framework. Moreover, the optimal Hadoop parameter settings depend significantly on the performance of the MapReduce applications running in the cluster. Although much research has been conducted on Hadoop parameter configuration optimization, configuring its resource setting parameters to minimize the execution time of MapReduce jobs in clusters still requires further research. Further, determining which type of machine instance should be used to minimize the resource usage cost of executing applications in clusters is also difficult. This paper addresses these problems by optimizing the instance resource usage and execution time of MapReduce tasks using a multi-objective steady-state Non-dominated Sorting Genetic Algorithm II (ssNSGA-II) approach. In this approach, the instance resource usage cost of MapReduce tasks is calculated from the cost of the machine instance type and the number of machine instances in the Hadoop cluster. The optimized configuration is identified by selecting, from the Pareto-optimal front, a setting that satisfies the two objective functions: minimizing instance resource usage and minimizing execution time. Although our system varies the machine instance type during the search process, it does not vary the cluster size, which is left for future work. Experiments conducted using workloads from the HiBench benchmark on a high-specification 6-node Hadoop cluster verify the efficacy of the proposed approach.
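The two-objective selection described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's ssNSGA-II implementation: it only filters a handful of hand-written candidates down to their non-dominated set and then picks one compromise point. The candidate names, prices, and execution times are made-up illustration values, the cost formula (hourly price times instance count times run time) is an assumed reading of "cost of machine instance types and the number of machine instances", and the equal-weight compromise rule is likewise an assumption, since the abstract does not say how the final setting is chosen from the front.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """One hypothetical configuration; all values are illustrative only."""
    name: str
    price_per_hour: float  # assumed hourly price of the instance type
    num_instances: int     # cluster size (fixed at 6 here)
    exec_time_s: float     # assumed measured job execution time, seconds


def usage_cost(c: Candidate) -> float:
    # Assumed cost model: price x instance count x hours of run time.
    return c.price_per_hour * c.num_instances * c.exec_time_s / 3600.0


def objectives(c: Candidate) -> Tuple[float, float]:
    # Both objectives are minimized: (resource usage cost, execution time).
    return (usage_cost(c), c.exec_time_s)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    # a dominates b if it is no worse in every objective and better in one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    objs = [objectives(c) for c in cands]
    return [
        c for i, c in enumerate(cands)
        if not any(dominates(objs[j], objs[i]) for j in range(len(cands)) if j != i)
    ]


def pick_compromise(front: List[Candidate]) -> Candidate:
    # Assumed rule: min-max normalize each objective, then minimize the sum.
    objs = [objectives(c) for c in front]
    lows = [min(o[k] for o in objs) for k in range(2)]
    spans = [(max(o[k] for o in objs) - lows[k]) or 1.0 for k in range(2)]
    scores = [sum((o[k] - lows[k]) / spans[k] for k in range(2)) for o in objs]
    return front[scores.index(min(scores))]


CANDIDATES = [
    Candidate("small", 0.10, 6, 1200.0),
    Candidate("medium", 0.20, 6, 700.0),
    Candidate("large", 0.40, 6, 500.0),
    Candidate("large-badcfg", 0.40, 6, 900.0),  # slower and costlier than "medium"
]
FRONT = pareto_front(CANDIDATES)
CHOSEN = pick_compromise(FRONT)

if __name__ == "__main__":
    for c in FRONT:
        print(c.name, round(usage_cost(c), 4), c.exec_time_s)
    print("chosen:", CHOSEN.name)
```

In a genetic-algorithm setting, the same dominance test would be applied to evolving populations rather than to a fixed list, and the execution times would come from running real workloads instead of constants.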
Keywords
MapReduce, Multi-objective Optimization, Steady-State NSGA-II, Hadoop Parameter Optimization