Reinforcement Learning based Efficient Mapping of DNN Models onto Accelerators

Shine Parekkadan Sunny, Satyajit Das

2022 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS), 2022

Abstract
The input tensors in each layer of Deep Neural Network (DNN) models are often partitioned (tiled) to fit within the limited on-chip memory of accelerators. Studies show that efficient tiling schedules (commonly referred to as mappings) for a given accelerator and DNN model reduce data movement between the accelerator and the different levels of the memory hierarchy, thereby improving performance. However, finding the layer-wise optimum mapping for a target architecture within a given energy and latency envelope is an open problem due to the huge mapping search space. In this paper, we propose a Reinforcement Learning (RL) based automated mapping approach that finds optimum schedules of DNN layers for a given architecture model without violating the specified energy and latency constraints. The learned policies easily adapt to a wide range of DNN models with different hardware configurations, facilitating transfer learning and reducing training time. Experiments show that the proposed work reduces latency and energy consumption by an average of 21.5% and 15.6%, respectively, compared to the state-of-the-art genetic algorithm-based GAMMA approach for a wide range of DNN models running on the NVIDIA Deep Learning Accelerator (NVDLA). The training time of RL-based transfer learning is 15× faster than that of GAMMA.
Keywords
Tiling, DNN Accelerator, Mapping, DQN, Energy Efficiency
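To make the idea in the abstract concrete, the sketch below shows how an RL agent can learn tiling factors for a single layer under latency and energy budgets. It is not the authors' code: the paper uses a DQN agent, whereas this stand-in uses simple tabular Q-learning, and the candidate tile sizes, analytical cost model, and budget values are hypothetical placeholders rather than figures from the paper.

```python
# Minimal sketch (not the authors' implementation): tabular Q-learning stand-in
# for the paper's DQN-based mapper. The agent sequentially picks tile sizes for
# two loop dimensions of one layer; mappings that violate the (hypothetical)
# latency/energy envelope receive a hard penalty.
import random
from collections import defaultdict

TILE_CHOICES = [4, 8, 16, 32]      # hypothetical candidate tile sizes
LATENCY_BUDGET = 1500.0            # hypothetical latency envelope
ENERGY_BUDGET = 900.0              # hypothetical energy envelope

def cost_model(tile_m, tile_n, M=128, N=128):
    """Toy analytical model: smaller tiles -> more iterations and transfers,
    larger tiles -> more on-chip work per iteration. Purely illustrative."""
    iters = (M // tile_m) * (N // tile_n)
    latency = iters * 10 + tile_m * tile_n * 0.5
    energy = iters * 6 + tile_m * tile_n * 0.2
    return latency, energy

def reward(tile_m, tile_n):
    latency, energy = cost_model(tile_m, tile_n)
    if latency > LATENCY_BUDGET or energy > ENERGY_BUDGET:
        return -100.0              # hard penalty: constraint violated
    return -(latency / LATENCY_BUDGET + energy / ENERGY_BUDGET)

Q = defaultdict(float)             # Q[(state, action)] -> value
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

def choose(state):
    if random.random() < EPS:      # epsilon-greedy exploration
        return random.choice(TILE_CHOICES)
    return max(TILE_CHOICES, key=lambda a: Q[(state, a)])

for episode in range(5000):
    tile_m = choose(("pick_m",))
    tile_n = choose(("pick_n", tile_m))
    r = reward(tile_m, tile_n)
    # Q-learning updates for the two sequential decisions
    best_next = max(Q[(("pick_n", tile_m), a)] for a in TILE_CHOICES)
    Q[(("pick_m",), tile_m)] += ALPHA * (GAMMA * best_next - Q[(("pick_m",), tile_m)])
    Q[(("pick_n", tile_m), tile_n)] += ALPHA * (r - Q[(("pick_n", tile_m), tile_n)])

best_m = max(TILE_CHOICES, key=lambda a: Q[(("pick_m",), a)])
best_n = max(TILE_CHOICES, key=lambda a: Q[(("pick_n", best_m), a)])
print("learned tiling:", best_m, best_n, "cost:", cost_model(best_m, best_n))
```

In the paper's setting the state and action spaces would additionally encode the layer shape and accelerator configuration (e.g., NVDLA buffer sizes), which is what allows a trained policy to transfer across DNN models instead of being retrained from scratch per layer.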