A Precision-Scalable Deep Neural Network Accelerator With Activation Sparsity Exploitation

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (2024)

Abstract
To meet the demands of a wide range of practical applications, precision-scalable deep neural network (DNN) accelerators have become an unavoidable trend. At the same time, it has been demonstrated that a DNN accelerator can achieve better computational efficiency by exploiting sparsity. DNN accelerators that combine precision scalability with sparsity exploitation are therefore expected to deliver better performance. In this article, we propose an efficient precision-scalable DNN accelerator that exploits the sparsity of activations. Precision scalability is obtained from a decomposable multiplier inspired by the well-known Bit Fusion design. In addition, a zero-skipping scheme is adopted to leverage the inherent sparsity of activations. We first modify the architecture of the conventional fusion unit (FU) to make it amenable to the zero-skipping scheme. A segmentation approach is then devised to resolve memory access conflicts, and a sparsity-aware mapping method is proposed to balance the workload across processing elements (PEs). Finally, we present a bit-splitting strategy that exploits sparsity at the bit level. Compared with state-of-the-art precision-scalable designs, the proposed accelerator provides speedups of 4.12x, 4.07x, and 6.62x in the 8b x 8b, 4b x 4b, and 2b x 2b precision modes, respectively, while also achieving 3.92x peak area efficiency and competitive peak energy efficiency.
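To make the two core ideas concrete, the following minimal Python sketch (not the paper's implementation; the function names and the unsigned-only arithmetic are assumptions for illustration) shows how a Bit Fusion style decomposable multiplier composes 2b x 2b partial products into a 4b or 8b product, and how partial products can be skipped whenever a 2-bit activation segment is zero, which is the essence of bit-level zero skipping.

```python
def split_2bit(value, n_bits):
    """Split an unsigned n_bits value into 2-bit segments, LSB first."""
    return [(value >> s) & 0b11 for s in range(0, n_bits, 2)]

def fused_multiply(act, wgt, n_bits=8):
    """Precision-scalable multiply built from 2b x 2b partial products,
    in the spirit of a Bit Fusion style decomposable multiplier.
    Partial products for zero-valued 2-bit activation segments are
    skipped (bit-level zero skipping).
    Returns (product, number_of_skipped_partial_products)."""
    acc, skipped = 0, 0
    act_segs = split_2bit(act, n_bits)
    wgt_segs = split_2bit(wgt, n_bits)
    for i, a in enumerate(act_segs):
        if a == 0:                       # zero segment: skip a whole row
            skipped += len(wgt_segs)     # of 2b x 2b partial products
            continue
        for j, w in enumerate(wgt_segs):
            acc += (a * w) << (2 * (i + j))  # shift-and-add composition
    return acc, skipped

# The same decomposed datapath serves all precision modes via n_bits:
assert fused_multiply(0b10100000, 0b00000011, n_bits=8)[0] == 160 * 3
assert fused_multiply(0b1010, 0b0011, n_bits=4)[0] == 10 * 3
assert fused_multiply(0b10, 0b11, n_bits=2)[0] == 2 * 3
```

A real FU would additionally handle signed operands, and the word-level zero skipping, segmentation, and sparsity-aware mapping described in the abstract operate above this multiplier level; those details are omitted from the sketch.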
Keywords
Deep neural networks, hardware accelerator, precision-scalable, sparsity exploitation, zero-skipping