Malleable Scheduling for Flows of MapReduce Jobs

semanticscholar(2013)

引用 0|浏览14
暂无评分
摘要
We consider the problem of scheduling MapReduce workloads. A workload consists of multiple independent flows, and each flow is itself a set of MapReduce jobs with precedence constraints. We model this as a parallel scheduling problem [5, 9], more specifically as precedence constrained malleable scheduling with linear speedup and processor maxima. Each flow is associated with an arbitrary cost function that describes the cost incurred for completing the flow at a particular time. The overall objective is to minimize either the total cost (minisum) or the maximum cost (minimax) of the flows. We use resource augmentation analysis to provide a (2, 3) bicriteria approximation algorithm for general minisum objectives, and a (1, 2) bicriteria approximation algorithm for minimax objectives. We note that (unless P=NP) no finite approximation ratio is possible without resource augmentation. As corollaries of these results, we obtain a 6-approximation algorithm for total weighted completion time (and thus average completion time and average stretch), and a 2-approximation algorithm for maximum weighted completion time (and thus makespan and maximum stretch). We also demonstrate via extensive simulation experiments the overall performance of our algorithms relative to optimal and also relative to other, standard MapReduce scheduling schemes. ∗IBM Almaden Research Center. Email: abalmin@us.ibm.com †IBM T.J. Watson Research Center. Email: {hildrum, viswanath, jlwolf}@us.ibm.com
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要