SparseBNN - Joint Algorithm/Hardware Optimization to Exploit Structured Sparsity in Binary Neural Network.

FPGA (2019)

Abstract
To reduce power-hungry floating-point operations and memory accesses in deep neural networks, quantized neural networks have been proposed that replace floating-point multiplications with simplified reduced-precision operations. To compensate for the accuracy loss caused by the high degree of quantization, wider layers with three or more times as many feature maps are employed. One by-product of these inflated layers is increased redundancy in the network. To further improve computational efficiency and leverage this inherent redundancy, we propose a joint optimization approach that simultaneously explores hardware-oriented training and efficient accelerator implementation of binary neural networks (BNNs) on FPGAs. More specifically, our SparseBNN method consists of two parts. First, SparseBNN-SW is a training algorithm that enhances the structured sparsity of BNNs by 1) training ternary weights that include zero, which are more amenable to pruning than binary weights, and 2) regulating the sparsity for more efficient hardware deployment. Second, SparseBNN-HW is an accelerator architecture that executes inference directly on the sparse-encoded format, saving both memory accesses and computations. Experimental results on representative datasets demonstrate that SparseBNN improves power efficiency (GOPS/Watt) and resource efficiency (GOPS/kLUT) over the baseline BNN FPGA implementation by 1.70X and 2.22X, respectively.
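
The abstract does not give implementation details, but the two ideas can be illustrated with a minimal Python sketch: ternarizing real-valued weights with a threshold that controls how many become zero, and an inference-side dot product that skips the zero weights entirely. The threshold `delta`, the function names, and the NumPy-based encoding below are illustrative assumptions, not the paper's actual training algorithm or hardware sparse format.

```python
import numpy as np

def ternarize(weights, delta=0.05):
    """Map real-valued weights to {-1, 0, +1}.

    `delta` is a hypothetical threshold: raising it zeroes out more
    weights, which is one simple way to regulate sparsity.
    """
    t = np.zeros_like(weights, dtype=np.int8)
    t[weights > delta] = 1
    t[weights < -delta] = -1
    return t

def sparse_dot(ternary_weights, activations):
    """Accumulate only over nonzero weights, skipping zeros."""
    nz = np.nonzero(ternary_weights)[0]
    return int(np.sum(ternary_weights[nz] * activations[nz]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=256)       # toy real-valued weights
    x = rng.choice([-1, 1], size=256)         # toy binarized activations
    tw = ternarize(w, delta=0.05)
    print("weight sparsity:", float(np.mean(tw == 0)))
    print("dot product:", sparse_dot(tw, x))
```

In this toy form, the fraction of zeroed weights reported by `weight sparsity` is what a sparse accelerator would exploit by never fetching or multiplying those entries.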