HBP: Hierarchically Balanced Pruning and Accelerator Co-Design for Efficient DNN Inference.

DAC 2023

Abstract
Weight pruning accelerates DNN inference by reducing the number of parameters and computations. Irregular pruning achieves high sparsity but suffers from low computational parallelism and imbalanced workloads, while coarse-grained structured pruning sacrifices sparsity for higher parallelism. To strike a better balance, we propose Hierarchically Balanced Pruning (HBP), which applies fine-grained yet structured adjustments on top of irregular pruning. In addition, HBP partitions the weight matrix into hierarchical blocks and constrains the sparsity of each block to balance the workloads. Furthermore, we propose an accelerator that unleashes the full potential of the pruning method. Experimental results show that our method achieves 1.1x-6x higher sparsity than prior work, and the accelerator achieves a 1.2x-13x speedup and a 3.3x energy-efficiency improvement over its counterparts.
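The abstract's core idea of constraining per-block sparsity for balanced workloads can be illustrated with a minimal sketch. The code below is a hypothetical NumPy implementation, not the paper's actual algorithm: it partitions a weight matrix into fixed-size blocks and keeps the same number of largest-magnitude weights in every block, so each block carries an identical nonzero count (and thus a balanced workload on a sparse accelerator). The function name, block shape, and keep ratio are all illustrative assumptions.

```python
import numpy as np

def balanced_block_prune(w, block_shape=(4, 4), keep_ratio=0.25):
    """Magnitude-prune `w` so every block keeps exactly the same
    number of weights (hypothetical sketch of balanced block pruning).

    Returns the pruned matrix and the boolean keep-mask.
    """
    rows, cols = w.shape
    br, bc = block_shape
    assert rows % br == 0 and cols % bc == 0, "matrix must tile evenly"
    # Identical nonzero budget per block -> balanced workloads.
    keep = max(1, int(round(br * bc * keep_ratio)))

    mask = np.zeros_like(w, dtype=bool)
    for r in range(0, rows, br):
        for c in range(0, cols, bc):
            block = np.abs(w[r:r + br, c:c + bc]).ravel()
            # Indices of the `keep` largest magnitudes in this block.
            top = np.argpartition(block, -keep)[-keep:]
            block_mask = np.zeros(br * bc, dtype=bool)
            block_mask[top] = True
            mask[r:r + br, c:c + bc] = block_mask.reshape(br, bc)
    return w * mask, mask
```

Because every block retains exactly `keep` nonzeros, a processing element assigned one block per cycle never idles waiting on a denser neighbor, which is the workload-balance property the abstract attributes to the hierarchical sparsity constraint.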
Keywords
DNN, structured pruning, sparse DNN accelerator