Scheduling of Elastic Message Passing Applications on HPC Systems.

JSSPP (2022)

Abstract
Elastic parallel applications, which can change the number of processors they use during execution, promise improved application and system performance, enable new classes of data- and event-driven, highly dynamic parallel applications, and make predictive, proactive fault tolerance via shrinkage possible in increasingly large and complex HPC systems, where the mean time between component failures is decreasing. Several challenges must be addressed before elastic applications become mainstream: 1) a clear understanding of programming models for elastic applications, 2) adequate support from message passing libraries, middleware, and resource management systems (RMS), and 3) a thorough investigation of scheduling algorithms. Scheduling elastic jobs requires communication between running jobs and the RMS, keeping track of pending jobs, and prioritizing which jobs to expand or shrink at a given point in time. These requirements make finding an optimal schedule difficult. We propose three scheduling algorithms for elastic applications, along with six candidate selection policies for prioritizing shrinkable applications, and investigate their impact on system and application performance. We also study the impact of workload characteristics and algorithms on performance. Our simulation results indicate that workload characteristics, as well as the range of elasticity (flexibility) of the elastic applications, affect system and application performance.
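To make the idea of a candidate selection policy concrete, the following is a minimal sketch of how a scheduler might pick running elastic jobs to shrink so that a pending job can start. It is not the paper's algorithm: the `ElasticJob` fields, the `select_shrink_candidates` function, and the "most shrinkable first" ordering are all hypothetical names and choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ElasticJob:
    """Hypothetical model of a running elastic job (not taken from the paper)."""
    name: str
    procs: int      # processors currently allocated
    min_procs: int  # smallest allocation the job can still run on

    @property
    def shrinkable(self) -> int:
        # Processors this job could release without dropping below its minimum.
        return self.procs - self.min_procs


def select_shrink_candidates(
    running: List[ElasticJob],
    needed: int,
    key: Callable[[ElasticJob], float],
) -> Dict[str, int]:
    """Build a shrink plan {job name: processors to release}.

    Jobs are visited in the order given by `key` (the selection policy).
    Returns an empty plan if the running jobs cannot free `needed` processors.
    """
    plan: Dict[str, int] = {}
    remaining = needed
    for job in sorted(running, key=key):
        if remaining <= 0:
            break
        give = min(job.shrinkable, remaining)
        if give > 0:
            plan[job.name] = give
            remaining -= give
    return plan if remaining <= 0 else {}


running = [
    ElasticJob("a", procs=16, min_procs=8),
    ElasticJob("b", procs=32, min_procs=8),
    ElasticJob("c", procs=8, min_procs=8),
]

# Assumed policy for illustration: shrink the job with the most slack first.
plan = select_shrink_candidates(running, needed=20, key=lambda j: -j.shrinkable)
print(plan)
```

Swapping the `key` function is enough to try a different ordering (for example, by current allocation size), which is one simple way to compare selection policies under the same workload.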
Keywords
Elastic applications, Malleable, Evolving, Scheduling