Shift Pruning: Equivalent Weight Pruning for CNN via Differentiable Shift Operator

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Abstract
Weight pruning is a well-known technique for network compression. In contrast to filter pruning, weight pruning achieves higher compression ratios because it is more fine-grained. However, pruning individual weights produces broken kernels that cannot be directly accelerated on general platforms, causing hardware compatibility issues. To address this, we propose Shift Pruning (SP), a novel weight pruning method that is compatible with general platforms. SP converts spatial convolutions into regular 1×1 convolutions and shift operations, which are simply memory movements requiring no additional FLOPs or parameters. Specifically, we decompose the original K×K convolution into parallel branches of shift-convolution operations and devise the Differentiable Shift Operator (DSO), an approximation of the actual shift operation, to automatically learn the shift directions crucial for adequate spatial interaction, guided by a designed shift-related loss function. A regularization term is proposed to prevent redundant shifting, which is beneficial in low-resolution settings. To further improve inference efficiency, we develop a post-training transformation that constructs a more compact model. The introduced channel-wise slimming allows SP to prune in a hybrid-structural manner, catering to both hardware compatibility and a high compression ratio. Extensive experiments on the CIFAR-10 and ImageNet datasets demonstrate that our method achieves superior performance in both accuracy and FLOPs reduction compared with other state-of-the-art techniques. For instance, on ImageNet, we reduce 48.8% of the total FLOPs of ResNet-34 with only a 0.22% drop in Top-1 accuracy.
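The equivalence that SP builds on, rewriting a K×K convolution as a sum of spatially shifted 1×1 branches, can be illustrated numerically. The following is a minimal single-channel NumPy sketch, not the paper's implementation; the function names and the valid-padding setup are our own assumptions for the illustration:

```python
import numpy as np

def conv2d_valid(x, k):
    """Direct valid-mode 2D convolution (cross-correlation) of a
    single-channel image x with a kh x kw kernel k."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def shift_decomposed_conv(x, k):
    """The same K x K convolution expressed as parallel shift + 1x1
    branches: each kernel tap (dy, dx) becomes a spatial shift of the
    input (a pure memory movement) followed by a scalar, i.e. 1x1,
    multiplication; the branches are summed."""
    kh, kw = k.shape
    H, W = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    out = np.zeros((oh, ow))
    for dy in range(kh):
        for dx in range(kw):
            shifted = x[dy:dy + oh, dx:dx + ow]   # shift: no FLOPs, no parameters
            out += k[dy, dx] * shifted            # 1x1 branch for this tap
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))
assert np.allclose(conv2d_valid(x, k), shift_decomposed_conv(x, k))
```

In the multi-channel case each branch's scalar becomes a genuine 1×1 convolution over channels; SP's contribution is learning which of these shift directions to keep (via the DSO) and pruning the rest.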