Exploiting Variable Precision Computation Array for Scalable Neural Network Accelerators

2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2020

Abstract
In this paper, we present a flexible Variable Precision Computation Array (VPCA) component for different accelerators. It combines a sparsification scheme for activations with a low-bit serial-parallel computation unit to improve the efficiency and resiliency of accelerators. The VPCA dynamically decomposes activations and weights (from 32 bits down to 3 bits wide, depending on the accelerator) into 2-bit serial computation units, and these 2-bit units can be combined in parallel for high throughput. We propose an on-the-fly compression and calculation strategy, SLE-CLC (single lane encoding, cross lane calculation), which further improves the performance of 2-bit parallel computing. Experimental results on image classification datasets show that VPCA outperforms DaDianNao, Stripes, and Loom-2bit by 4.67×, 2.42×, and 1.52×, respectively, on convolution layers without additional overhead.
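The idea of decomposing an operand into 2-bit units can be illustrated with a small software sketch. This is not the authors' hardware design or the SLE-CLC strategy. It only shows, under simplifying assumptions (unsigned integers, a hypothetical `split_2bit` helper, and a hypothetical `serial_multiply` function), how a multiplication can be carried out two activation bits per step, and how zero-valued 2-bit digits can be skipped as a simplified stand-in for activation sparsification.

```python
def split_2bit(value, width):
    """Split an unsigned `width`-bit integer into 2-bit digits, least significant first.

    Hypothetical helper for illustration; odd widths are padded up to the next even width.
    """
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the given width")
    n_digits = (width + 1) // 2
    return [(value >> (2 * i)) & 0b11 for i in range(n_digits)]


def serial_multiply(activation, weight, act_width):
    """Multiply by consuming the activation two bits per step (shift-and-accumulate).

    Returns (product, steps), where `steps` counts only the nonzero 2-bit digits.
    Skipping zero digits is an assumption made here to mimic sparsity exploitation.
    """
    acc = 0
    steps = 0
    for i, digit in enumerate(split_2bit(activation, act_width)):
        if digit == 0:
            continue  # a zero digit contributes nothing to the product
        acc += (digit * weight) << (2 * i)  # weight the partial product by 4**i
        steps += 1
    return acc, steps


if __name__ == "__main__":
    product, steps = serial_multiply(13, 7, act_width=4)
    print(product, steps)
```

In this sketch a narrower activation needs fewer 2-bit steps, and an activation with many zero digits needs fewer still, which is the intuition behind variable-precision serial computation; the independent partial products are also what would allow several 2-bit units to work in parallel.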
Keywords
Deep Neural Networks,Accelerator,Energy Efficiency Computing Array,Dynamic Quantization,Resiliency