CANNA: Neural network acceleration using configurable approximation on GPGPU.

Mohsen Imani, Max Masich, Daniel Peroni, Pushen Wang, Tajana Rosing

ASP-DAC (2018)

Abstract
Neural networks have been successfully used in many applications, but their computational complexity makes them difficult to implement on embedded devices. Neural networks are inherently approximate and can therefore be simplified. In this paper, CANNA proposes a gradual training approximation that adaptively sets the level of hardware approximation depending on the neural network's internal error, instead of applying uniform hardware approximation. To accelerate inference, CANNA's layer-based approximation approach selectively relaxes the computation in each layer of the neural network as a function of its sensitivity to approximation. For hardware support, we use a configurable floating point unit that dynamically identifies the inputs which produce the largest approximation error and processes them in precise mode instead. We evaluate the accuracy and efficiency of our design by integrating configurable FPUs into AMD's Southern Islands GPU architecture. Our experimental evaluation shows that CANNA achieves up to 4.84× (7.13×) energy savings and 3.22× (4.64×) speedup when training four different neural network applications with 0% (2%) quality loss, as compared to the implementation on a baseline GPU. During the inference phase, our layer-based approach improves energy efficiency by 4.42× (6.06×) and yields 2.96× (3.98×) speedup while ensuring 0% (2%) quality loss.
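The layer-based sensitivity idea in the abstract can be illustrated in software. The sketch below is a hypothetical emulation, not the authors' implementation: it mimics a reduced-precision FPU by zeroing low mantissa bits of float32 operands, then measures how much the output of a small multilayer network changes when a single layer's matrix multiply runs at reduced precision. Layers whose relative output error stays small would be candidates for more aggressive approximation. All function names and the network itself are illustrative assumptions.

```python
import numpy as np

def truncate_mantissa(x, keep_bits):
    """Emulate a reduced-precision FPU (a software stand-in for a
    configurable floating point unit) by zeroing the low
    (23 - keep_bits) mantissa bits of each float32 value."""
    x = np.ascontiguousarray(x, dtype=np.float32)
    bits = x.view(np.uint32)
    mask = np.uint32((0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF)
    return (bits & mask).view(np.float32)

def forward(x, weights, approx_layer=None, keep_bits=8):
    """Tiny ReLU MLP forward pass; optionally run one layer's
    matmul on mantissa-truncated operands."""
    h = x
    for i, w in enumerate(weights):
        if i == approx_layer:
            h = truncate_mantissa(h, keep_bits) @ truncate_mantissa(w, keep_bits)
        else:
            h = h @ w
        h = np.maximum(h, 0.0)  # ReLU
    return h

rng = np.random.default_rng(0)
weights = [rng.standard_normal((16, 16)).astype(np.float32) * 0.3
           for _ in range(3)]
x = rng.standard_normal((4, 16)).astype(np.float32)
exact = forward(x, weights)

# Per-layer sensitivity: relative output error when only that
# layer is approximated. Less sensitive layers tolerate more
# aggressive precision reduction.
for layer in range(len(weights)):
    approx = forward(x, weights, approx_layer=layer, keep_bits=8)
    err = np.linalg.norm(approx - exact) / (np.linalg.norm(exact) + 1e-12)
    print(f"layer {layer}: relative error {err:.4f}")
```

In this toy setup the error introduced by truncating one layer is bounded by the lost mantissa precision (roughly 2^-keep_bits per operand), which is why a per-layer error profile can guide where relaxed computation is safe.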
Keywords
uniform hardware approximation, gradual training approximation, neural networks, configurable approximation, neural network acceleration, configurable floating point unit, CANNA's layer-based approximation approach