HSAS: Efficient task scheduling for large scale heterogeneous systolic array accelerator cluster

Kaige Yan, Yanshuang Song, Tao Liu, Jingweijia Tan, Xiaohui Wei, Xin Fu

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2024)

Abstract
Efficiently processing a large number of deep neural network models can be challenging because models, and even layers within a model, differ significantly. The systolic array has become a common architecture for processing neural networks. With this architecture, different array sizes can lead to large differences in hardware utilization for the same network. Therefore, to achieve optimal processing efficiency across a large number of models, a heterogeneous systolic array accelerator cluster could be more advantageous than a homogeneous one. In this work, we propose such a heterogeneous architecture and design its scheduling algorithm, HSAS. HSAS uses our systolic array performance and energy models to evaluate how well models fit systolic arrays. HSAS also takes load balance and preemption into consideration. We further introduce a task decomposition algorithm and a subtask priority management table to enable finer-grained, subtask-level scheduling. Our evaluation shows that task-level HSAS can improve average normalized turnaround time, system throughput, and fairness by up to more than 80% compared with classic and state-of-the-art methods, while subtask-level HSAS can achieve 18%-63% improvement compared to other methods.
Keywords
Task scheduling, Systolic array, Heterogeneous computing, Computer architecture, Performance modeling, Energy modeling
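
The abstract's observation that array size strongly affects hardware utilization for the same network can be illustrated with a rough sketch. The code below assumes a simple tiling model in which a K x N weight matrix is cut into tiles that each occupy a rows x cols array, so partially filled edge tiles leave processing elements idle. This is an illustrative assumption, not the performance or energy model used by HSAS; the function names, array sizes, and layer shape are hypothetical.

```python
import math


def spatial_utilization(k, n, rows, cols):
    """Fraction of processing elements doing useful work when a K x N
    weight matrix is tiled onto a rows x cols array (assumed model)."""
    tiles_k = math.ceil(k / rows)   # tiles needed along the K dimension
    tiles_n = math.ceil(n / cols)   # tiles needed along the N dimension
    occupied = tiles_k * rows * tiles_n * cols  # PE slots reserved by all tiles
    return (k * n) / occupied


def best_array(k, n, arrays):
    """Pick the array size with the highest estimated utilization."""
    return max(arrays, key=lambda a: spatial_utilization(k, n, a[0], a[1]))


# Hypothetical cluster with three array sizes and one 96 x 96 layer.
arrays = [(32, 32), (64, 64), (128, 128)]
for rows, cols in arrays:
    u = spatial_utilization(96, 96, rows, cols)
    print(f"{rows}x{cols}: utilization {u:.4f}")
print("best fit:", best_array(96, 96, arrays))
```

Under this simplified model the 96 x 96 layer fills a 32 x 32 array completely but leaves much of a 64 x 64 or 128 x 128 array idle, which is the kind of mismatch a heterogeneous cluster scheduler could exploit.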