Cooperative Job Scheduling and Data Allocation in Data-Intensive Parallel Computing Clusters

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing (2019)

Abstract
In data-intensive parallel computing clusters, it is important to provide deadline-guaranteed service to jobs while minimizing resource usage (e.g., network bandwidth and energy). Under the current computing framework, which first allocates data and then schedules jobs, it is difficult in a busy cluster with many jobs to achieve high data locality (hence low bandwidth consumption), deadline guarantees, and high energy savings simultaneously. We model the problem of achieving these three objectives simultaneously using integer programming. Because the problem is NP-hard, we propose a heuristic Cooperative job Scheduling and data Allocation method (CSA). CSA reverses the order of data allocation and job scheduling used in the current computing framework. Scheduling jobs first enables CSA to proactively consolidate tasks that request more common data onto the same server when conducting deadline-aware scheduling, and to consolidate the tasks onto as few servers as possible to maximize energy savings. This allows the subsequent data allocation step to place each data block on the server that hosts most of the block's requester tasks, thus maximally enhancing data locality. To achieve a tradeoff between data locality and energy savings with specified weights, CSA uses a cooperative recursive refinement process that recursively adjusts the job schedule and the data allocation schedule. We further propose two enhancement algorithms (a minimum k-cut data reallocation algorithm and a bipartite-based task reassignment algorithm) that improve the performance of CSA through additional data reallocation and task reassignment, respectively. Trace-driven experiments in simulation and on a real cluster show that CSA outperforms other schedulers in supplying deadline-guaranteed and resource-efficient services, and that each enhancement algorithm is effective in improving CSA.
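The data allocation step described in the abstract (placing each block on the server that hosts most of its requester tasks) can be illustrated with a minimal Python sketch. This is not the paper's CSA implementation: the function names `allocate_blocks` and `locality`, the dictionary inputs, and the alphabetical tie-breaking rule are all assumptions made for illustration, and the sketch ignores server storage capacity, deadlines, and energy, which the full method takes into account.

```python
from collections import Counter, defaultdict


def allocate_blocks(task_server, task_blocks):
    """Place each block on the server hosting most of its requester tasks.

    task_server: dict mapping task id -> server id (a given job schedule).
    task_blocks: dict mapping task id -> list of requested block ids.
    Both structures are hypothetical; capacity limits are not modeled.
    """
    votes = defaultdict(Counter)
    for task, blocks in task_blocks.items():
        server = task_server[task]
        for block in blocks:
            votes[block][server] += 1
    # Pick the server with the most requester tasks; break ties by server id
    # (an assumed rule, chosen only to make the result deterministic).
    return {
        block: min(counts, key=lambda s: (-counts[s], s))
        for block, counts in votes.items()
    }


def locality(task_server, task_blocks, placement):
    """Fraction of (task, block) requests served from the task's own server."""
    total = 0
    local = 0
    for task, blocks in task_blocks.items():
        for block in blocks:
            total += 1
            if placement[block] == task_server[task]:
                local += 1
    return local / total if total else 1.0


# Toy schedule: t1 and t2 were consolidated on s1, t3 runs on s2.
task_server = {"t1": "s1", "t2": "s1", "t3": "s2"}
task_blocks = {"t1": ["a", "b"], "t2": ["a"], "t3": ["a", "b"]}

placement = allocate_blocks(task_server, task_blocks)
print(placement)
print(locality(task_server, task_blocks, placement))
```

In this toy schedule, block `a` has two requester tasks on `s1` and one on `s2`, so it goes to `s1`. Because the schedule is fixed before data is placed, consolidating tasks that share data onto one server directly raises the fraction of local reads.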
Keywords
Task analysis, Servers, Resource management, Schedules, Clustering algorithms, Processor scheduling, Costs, Job scheduler, data allocation, parallel computing, data locality