Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling

2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (2021)

Abstract
To meet surging demands for deep learning inference services, many cloud computing vendors employ high-performance specialized accelerators, called neural processing units (NPUs). One important challenge for effective use of NPUs is to achieve high resource utilization over a wide spectrum of deep neural network (DNN) models with diverse arithmetic intensities. There is often an intrinsic mismatch...
Keywords
Schedules, Scheduling algorithms, Computational modeling, Neural networks, Memory management, Random access memory, Bandwidth