Reducing Fragmentation on 3D Torus-Based HPC Systems Using Packing-Based Job Scheduling and Job Placement Reconfiguration

2017 16th International Symposium on Parallel and Distributed Computing (ISPDC), 2017

Abstract
We address the topology-aware job scheduling and placement problems on 3D torus-based high performance computing (HPC) systems, with the objective of reducing system fragmentation. In our previous work, we proposed a job placement algorithm based on a local migration process, which aims to reduce the internal fragmentation caused by using a convex prism shape for job allocation. However, HPC systems are prone to external fragmentation as well. Hence, in this paper, we focus mainly on reducing the external fragmentation introduced by the job scheduling and placement processes. First, from the job scheduling perspective, we propose a packing-based job scheduling strategy, which reduces the external fragmentation by using the First Come First Served (FCFS) + backfilling strategy. Second, we review the migration-based job placement algorithm from our previous work. Third, to reduce the external fragmentation that results from running jobs being scattered across the system, we propose a job placement reconfiguration algorithm, which uses a global migration process to rearrange the placement of the running jobs across the system. Under the off-line scenario, both local and global migration are emulated virtual processes and incur no migration overhead. Under the on-line scenario, however, migration is a real process and leads to a migration delay. Therefore, we propose a buffer-based on-line scheduling model, which helps to avoid the delay of local migration. The evaluation results validate the efficiency of our approach in reducing system fragmentation and improving system utilization.
Keywords
Topology-aware, job placement, off-line, on-line, migration, packing-based job scheduling
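
The internal fragmentation mentioned in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a job requesting a given number of nodes is allocated the smallest-volume axis-aligned prism that fits inside the torus, and it counts the nodes in that prism that the job does not need. The function name `smallest_prism` and the brute-force search are hypothetical; this is not the placement algorithm proposed in the paper.

```python
from itertools import product


def smallest_prism(n_nodes, dims):
    """Find the smallest-volume a x b x c prism holding at least n_nodes.

    dims is the (X, Y, Z) size of the torus. Returns (shape, wasted), where
    wasted is the number of allocated but unused nodes, i.e. the internal
    fragmentation of this single allocation. Illustrative assumption only.
    """
    best = None
    # Enumerate every prism shape that fits inside the torus dimensions.
    for shape in product(*(range(1, d + 1) for d in dims)):
        volume = shape[0] * shape[1] * shape[2]
        if volume >= n_nodes and (best is None or volume < best[1]):
            best = (shape, volume)
    if best is None:
        raise ValueError("job does not fit in the torus")
    shape, volume = best
    return shape, volume - n_nodes


if __name__ == "__main__":
    # A 7-node job on a 4 x 4 x 4 torus cannot get an exact 7-node prism,
    # so one node of the allocated 8-node prism is left unused.
    print(smallest_prism(7, (4, 4, 4)))
```

Under this assumption, any request whose size cannot be factored into three side lengths that fit the torus wastes at least one node, which is the kind of loss a local migration process would try to recover.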