HBP: Hierarchically Balanced Pruning and Accelerator Co-Design for Efficient DNN Inference.

DAC 2023

Abstract
Weight pruning accelerates DNN inference by reducing the number of parameters and computations. Irregular pruning achieves high sparsity but suffers from low computational parallelism and imbalanced workloads, while coarse-grained structured pruning sacrifices sparsity for higher parallelism. To strike a better balance, we propose Hierarchically Balanced Pruning (HBP), which applies fine-grained yet structured adjustments on top of irregular pruning. In addition, HBP partitions the weight matrix into hierarchical blocks and constrains the sparsity of each block to balance the workloads. Furthermore, we propose an accelerator that unleashes the full potential of the pruning method. Experimental results show that our method achieves 1.1x-6x higher sparsity than prior work, and the accelerator achieves a 1.2x-13x speedup and a 3.3x energy-efficiency improvement over its counterparts.
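The abstract's core idea of constraining per-block sparsity for balanced workloads can be illustrated with a minimal sketch. The code below is a hypothetical NumPy implementation, not the paper's actual algorithm: it partitions a weight matrix into fixed-size blocks and keeps the same number of largest-magnitude weights in every block, so each block carries an identical nonzero count (and thus a balanced workload on a sparse accelerator). The function name, block shape, and keep ratio are all illustrative assumptions.

```python
import numpy as np

def balanced_block_prune(w, block_shape=(4, 4), keep_ratio=0.25):
    """Magnitude-prune `w` so every block keeps exactly the same
    number of weights (hypothetical sketch of balanced block pruning).

    Returns the pruned matrix and the boolean keep-mask.
    """
    rows, cols = w.shape
    br, bc = block_shape
    assert rows % br == 0 and cols % bc == 0, "matrix must tile evenly"
    # Identical nonzero budget per block -> balanced workloads.
    keep = max(1, int(round(br * bc * keep_ratio)))

    mask = np.zeros_like(w, dtype=bool)
    for r in range(0, rows, br):
        for c in range(0, cols, bc):
            block = np.abs(w[r:r + br, c:c + bc]).ravel()
            # Indices of the `keep` largest magnitudes in this block.
            top = np.argpartition(block, -keep)[-keep:]
            block_mask = np.zeros(br * bc, dtype=bool)
            block_mask[top] = True
            mask[r:r + br, c:c + bc] = block_mask.reshape(br, bc)
    return w * mask, mask
```

Because every block retains exactly `keep` nonzeros, a processing element assigned one block per cycle never idles waiting on a denser neighbor, which is the workload-balance property the abstract attributes to the hierarchical sparsity constraint.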
Keywords
DNN, structured pruning, sparse DNN accelerator