A Tiny Accelerator for Mixed-Bit Sparse CNN Based on Efficient Fetch Method of SIMO SPad

IEEE Transactions on Circuits and Systems II: Express Briefs (2023)

Abstract
Convolutional neural networks (CNNs) have been implemented with custom hardware on edge devices because their algorithms have proven successful in many artificial intelligence applications. Although many unstructured pruning and mixed-bit quantization algorithms have been proposed to compress CNNs effectively, few hardware accelerators support both sparse and mixed-bit CNNs. Moreover, sparse matrix computation consumes substantial hardware resources, such as registers or BRAM, to fetch the needed input activations into the processing elements (PEs). This brief presents a tiny accelerator for mixed-bit sparse CNNs featuring a novel single-vector-based compressed sparse filter (CSF) method and a single-input multiple-output scratchpad (SIMO SPad) to compress weights efficiently and fetch the needed input activations. The SIMO SPad is shared by multiple PEs, saving 13.34% of CLB LUTs and 46.24% of CLB registers. Furthermore, the accelerator supports mixed-bit sparse computation to obtain better accuracy and performance. When tested on VGG16, mixed-bit sparse execution improves performance over an 8-bit non-sparse baseline by $4.85\times$ on CIFAR-10 and $3.33\times$ on ImageNet, with only a small accuracy degradation. Compared with state-of-the-art accelerators, the accelerator achieves $1.40\times$ to $2.98\times$ higher DSP efficiency and $1.91\times$ higher energy efficiency.
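The abstract does not specify the exact on-chip layout of the single-vector CSF encoding, so the following is only a minimal software sketch of the general idea: nonzero weights are packed into one vector of (zero-run, value) pairs, and the run lengths tell the PE which input activation to fetch, so that zero weights trigger neither a multiply nor a fetch. Field widths, offset encoding, and SPad port behavior are all assumptions here, not the paper's actual format.

```python
# Illustrative sketch of a single-vector compressed sparse filter (CSF).
# Assumed encoding: one vector of (run, value) pairs, where `run` is the
# number of zero weights skipped since the previous nonzero weight.

def compress_filter(weights):
    """Pack nonzero weights into a single vector of (run, value) pairs."""
    csf, run = [], 0
    for w in weights:
        if w == 0:
            run += 1            # skip zero weight; no storage needed
        else:
            csf.append((run, w))
            run = 0
    return csf

def sparse_dot(csf, activations):
    """MAC over the compressed filter: each run advances the activation
    index so only the needed activations are fetched (the role a shared
    SIMO SPad would play in hardware)."""
    acc, idx = 0, 0
    for run, w in csf:
        idx += run              # zero weights -> skipped fetches
        acc += w * activations[idx]
        idx += 1
    return acc

if __name__ == "__main__":
    weights = [0, 3, 0, 0, -2, 0, 5]
    acts = [1, 2, 3, 4, 5, 6, 7]
    csf = compress_filter(weights)   # [(1, 3), (2, -2), (1, 5)]
    assert sparse_dot(csf, acts) == sum(w * a for w, a in zip(weights, acts))
    print(csf, sparse_dot(csf, acts))
```

In hardware, the appeal of a shared fetch structure like the SIMO SPad is that the index bookkeeping above is done once and the fetched activation is broadcast to multiple PEs, which is consistent with the reported CLB LUT and register savings.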
Keywords
Convolutional neural networks, hardware accelerator, unstructured pruning, mixed-bit quantization