Single-Shot Pruning for Pre-trained Models: Rethinking the Importance of Magnitude Pruning

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW (2023)

Abstract
Transformer models with large-scale pre-training have performed excellently in various computer vision tasks. However, such models are huge and difficult to deploy on mobile devices with limited computational resources. Moreover, the computational cost of fine-tuning is high when the model is optimized for a downstream task. Therefore, our goal is to compress large pre-trained models with minimal performance degradation before fine-tuning. In this paper, we first present preliminary experimental results on parameter changes when training either pre-trained models or models trained from scratch on a downstream task. We found that the parameter magnitudes of pre-trained models remained largely unchanged before and after training, in contrast to models trained from scratch. With this in mind, we propose an unstructured pruning method for pre-trained models. Our method evaluates the parameters without training and prunes in a single shot to obtain sparse models. Our experimental results show that the sparse model pruned by our method achieves higher accuracy than previous methods on the CIFAR-10, CIFAR-100, and ImageNet classification tasks.
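To illustrate the general idea of single-shot, training-free magnitude pruning described in the abstract, the sketch below zeroes out the smallest-magnitude weights of a pre-trained model's linear layers using one global threshold. This is a minimal, hedged example only: the function name, the choice of Linear layers, the global-threshold rule, and the toy model are assumptions for illustration, not the paper's exact importance criterion.

```python
import torch
import torch.nn as nn

def single_shot_magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> None:
    """Zero out the smallest-magnitude Linear weights in one shot.

    Illustrative global magnitude pruning, not the paper's exact scoring rule.
    `sparsity` is the fraction of weights to remove (set to zero).
    """
    # Gather absolute weight values from all Linear layers to pick a global threshold.
    scores = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules() if isinstance(m, nn.Linear)])
    k = max(1, int(sparsity * scores.numel()))
    threshold = torch.kthvalue(scores, k).values

    # Apply binary masks: weights with magnitude at or below the threshold become zero.
    for m in model.modules():
        if isinstance(m, nn.Linear):
            mask = (m.weight.detach().abs() > threshold).to(m.weight.dtype)
            m.weight.data.mul_(mask)

# Usage on a toy MLP block standing in for a pre-trained transformer sub-module.
model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
single_shot_magnitude_prune(model, sparsity=0.9)
```

In this setup the pruning decision uses only the pre-trained weights themselves, so no forward or backward passes on downstream data are needed before the sparse model is handed off to fine-tuning.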