ASA – The Adaptive Scheduling Algorithm
CoRR (2024)
Abstract
In High Performance Computing (HPC) infrastructures, the control of resources
by batch systems can lead to prolonged queue waiting times and adverse effects
on the overall execution times of applications, particularly in data-intensive
and low-latency workflows where efficient processing hinges on resource
planning and timely allocation. Allocating the maximum capacity upfront ensures
the fastest execution but results in spare and idle resources, extended queue
waits, and costly usage. Conversely, dynamic allocation based on workflow stage
requirements optimizes resource usage but may negatively impact the total
workflow makespan. To address these issues, we introduce ASA, the Adaptive
Scheduling Algorithm. ASA is a novel, convergence-proven scheduling technique
that minimizes job inter-stage waiting times by estimating queue waiting times
and proactively submitting resource change requests ahead of time. It strikes
a balance between exploration and exploitation, that is, between learning
queue waiting times and applying what it has learnt. Real-world experiments at
two supercomputer centers with scientific workflows demonstrate ASA's
effectiveness, achieving near-optimal resource utilization and accuracy, with
up to 10
makespan, respectively.
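
The idea of estimating queue waits and submitting the next stage's request early can be illustrated with a minimal sketch. This is not ASA's actual update rule, which the abstract does not specify. It assumes a generic exponential-weights scheme over a fixed set of candidate wait times; the names WaitEstimator, submit_time, learning_rate, and the candidate values are all hypothetical.

```python
import math
import random


class WaitEstimator:
    """Hypothetical exponential-weights estimator over candidate queue waits (seconds)."""

    def __init__(self, candidates, learning_rate=0.5, seed=0):
        self.candidates = list(candidates)
        self.weights = [1.0 / len(self.candidates)] * len(self.candidates)
        self.learning_rate = learning_rate  # assumed tuning parameter
        self.rng = random.Random(seed)

    def sample(self):
        # Exploration: sampling keeps low-weight candidates in play.
        return self.rng.choices(self.candidates, weights=self.weights, k=1)[0]

    def update(self, observed_wait):
        # Exploitation: down-weight candidates far from the observed wait.
        scale = max(self.candidates)
        for i, cand in enumerate(self.candidates):
            loss = min(1.0, abs(cand - observed_wait) / scale)
            self.weights[i] *= math.exp(-self.learning_rate * loss)
        # Renormalize so the weights stay a probability distribution.
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]

    def best(self):
        # Candidate with the highest current weight.
        return self.candidates[self.weights.index(max(self.weights))]


def submit_time(stage_end, estimated_wait):
    """Time at which to submit the next stage's request so that, if the
    estimate holds, resources arrive as the current stage ends."""
    return max(0.0, stage_end - estimated_wait)


estimator = WaitEstimator([60, 300, 900, 1800])
for _ in range(50):
    estimator.update(observed_wait=320)  # pretend the queue wait was 320 s each time
print(estimator.best(), submit_time(stage_end=1000, estimated_wait=estimator.best()))
```

In this sketch, a request for the next stage would be submitted estimated_wait seconds before the current stage is expected to finish, so an accurate estimate removes the inter-stage wait without holding idle resources.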