Multi-Dimensional Dynamic Pruning: Exploring Spatial and Channel Fuzzy Sparsity

IEEE Transactions on Fuzzy Systems (2024)

Abstract
Dynamic pruning is an effective model compression method for reducing the computational cost of neural networks. However, existing dynamic pruning methods are limited to pruning along a single dimension (channel, spatial, or depth), which cannot fully exploit the redundancy of the network. Moreover, most current state-of-the-art methods implement dynamic pruning by masking out a subset of channels and pixels during training, and therefore fail to accelerate inference. To address these limitations, we propose a novel fuzzy-based Multi-Dimensional Dynamic Pruning (MDDP) paradigm that dynamically compresses neural networks along both the channel and spatial dimensions. Specifically, we design a multi-dimensional fuzzy-mask block that simultaneously learns which spatial positions and channels are redundant and should be pruned. The Gumbel-Softmax trick, combined with a sparsity loss, is then used to train these mask modules in an end-to-end manner. At test time, we convert the features and convolution kernels into two matrices and implement sparse convolution through matrix multiplication to accelerate inference. Extensive experiments demonstrate that our method outperforms existing methods in terms of both accuracy and computational cost. For instance, on the CIFAR-10 dataset, our method prunes 68% of the FLOPs of ResNet-56 with only a 0.07% drop in Top-1 accuracy.
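
The mask-based training described above lends itself to a short illustration. Below is a minimal sketch, assuming a PyTorch setting, of how keep/prune decisions along the channel and spatial dimensions could be drawn with the Gumbel-Softmax trick and regularized toward a target sparsity. The module name FuzzyMaskBlock, the structure of the logit predictors, and the sparsity target are hypothetical illustrations, not the paper's actual implementation.

# Hypothetical sketch of a Gumbel-Softmax-trained channel/spatial mask block.
# Module and function names here are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuzzyMaskBlock(nn.Module):
    """Predicts soft keep/prune decisions along channel and spatial dimensions."""
    def __init__(self, channels: int, tau: float = 1.0):
        super().__init__()
        self.tau = tau
        # Channel mask: per-channel keep/prune logits from globally pooled features.
        self.channel_logits = nn.Linear(channels, channels * 2)
        # Spatial mask: per-pixel keep/prune logits from a 1x1 convolution.
        self.spatial_logits = nn.Conv2d(channels, 2, kernel_size=1)

    def forward(self, x: torch.Tensor):
        n, c, h, w = x.shape
        # Channel dimension: sample hard keep/prune decisions per channel.
        pooled = x.mean(dim=(2, 3))                            # (N, C)
        ch = self.channel_logits(pooled).view(n, c, 2)         # (N, C, 2)
        ch_mask = F.gumbel_softmax(ch, tau=self.tau, hard=True, dim=-1)[..., 0]
        # Spatial dimension: sample hard keep/prune decisions per pixel.
        sp = self.spatial_logits(x).permute(0, 2, 3, 1)        # (N, H, W, 2)
        sp_mask = F.gumbel_softmax(sp, tau=self.tau, hard=True, dim=-1)[..., 0]
        # A feature entry survives only if both its channel and its pixel are kept.
        mask = ch_mask.view(n, c, 1, 1) * sp_mask.view(n, 1, h, w)
        return x * mask, mask

def sparsity_loss(mask: torch.Tensor, target: float = 0.32) -> torch.Tensor:
    """Penalize deviation of the kept ratio from a target (e.g. ~32% kept)."""
    return (mask.mean() - target) ** 2

During training such a mask is applied multiplicatively, so gradients flow through the Gumbel-Softmax samples. At test time, in the spirit of the abstract, the hard masks would instead be used to gather only the kept rows and columns of the im2col feature matrix and the flattened kernel matrix, so that the sparse convolution reduces to a smaller dense matrix multiplication.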
Keywords
Deep neural networks, dynamic pruning, model compression, fuzzy sparsity