Fast data-free model compression via dictionary-pair reconstruction

Knowledge and Information Systems (2023)

Abstract
Deep neural networks (DNNs) have achieved satisfactory results on various vision tasks; however, their large model sizes and massive numbers of parameters complicate deployment. DNN compression can effectively reduce the memory footprint of a deep model so that it can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to applications that require fast computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computational cost. We propose a new model compression method, termed dictionary-pair-based fast data-free DNN compression, which reduces the memory consumption of DNNs without extra training and greatly improves compression efficiency. Specifically, our method performs tensor decomposition of a DNN model with a fast dictionary-pair learning-based reconstruction approach, which can be applied to different weight layers (e.g., convolutional and fully connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary-pair-driven fast reconstruction, which can potentially discover more fine-grained information and enables parallel model compression. Then, dictionaries with a smaller memory footprint are learned to reconstruct the weights. Moreover, automatic hyper-parameter tuning and a shared-dictionary mechanism are proposed to improve the model's performance and availability. Extensive experiments on popular DNN models (i.e., VGG-16, ResNet-18 and ResNet-50) show that our proposed weight compression method significantly reduces the memory footprint and speeds up the compression process, with less performance loss.
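To illustrate the general idea of partition-wise, dictionary-pair-based weight reconstruction described above, the sketch below compresses a single weight matrix by splitting it into column partitions and fitting a small synthesis/analysis dictionary pair to each partition. This is a minimal illustration under our own assumptions, not the authors' implementation: the partition scheme, the alternating ridge-regression updates, and the names `k`, `n_parts` and `lam` are all hypothetical choices, and the shared-dictionary and automatic hyper-parameter tuning components of the paper are not modeled.

```python
# Hypothetical sketch (not the authors' released code): approximate each weight
# partition W_i with a dictionary pair, W_i ≈ D_i @ (P_i @ W_i), and store only
# the synthesis dictionary D_i plus the codes A_i = P_i @ W_i, which is smaller
# than W_i when the dictionary size k is small.
import numpy as np

def fit_dictionary_pair(W, k, lam=1e-3, n_iter=20):
    """Alternating ridge-regression updates for W ≈ D @ A, then derive P with A ≈ P @ W."""
    m, _ = W.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((m, k)) / np.sqrt(m)      # synthesis dictionary
    reg = lam * np.eye(k)
    for _ in range(n_iter):
        # Coding step: codes A that best reconstruct W through D.
        A = np.linalg.solve(D.T @ D + reg, D.T @ W)   # k x n codes
        # Synthesis step: refit D to the current codes.
        D = W @ A.T @ np.linalg.inv(A @ A.T + reg)
    # Analysis dictionary P maps weights linearly to their codes (A ≈ P @ W),
    # which is what lets the codes be obtained without iterative sparse coding.
    P = A @ np.linalg.pinv(W)                         # k x m
    return D, P

def compress_layer(W, n_parts=4, k=32):
    """Partition W column-wise and learn one dictionary pair per partition."""
    compressed = []
    for W_i in np.array_split(W, n_parts, axis=1):
        D_i, P_i = fit_dictionary_pair(W_i, k)
        compressed.append((D_i, P_i @ W_i))           # store dictionary + codes
    return compressed

def reconstruct_layer(compressed):
    """Rebuild the full weight matrix from the stored dictionaries and codes."""
    return np.concatenate([D_i @ A_i for D_i, A_i in compressed], axis=1)

# Toy usage: a 256 x 512 fully connected weight, compressed partition-wise.
W = np.random.randn(256, 512).astype(np.float32)
rebuilt = reconstruct_layer(compress_layer(W))
print("relative reconstruction error:", np.linalg.norm(W - rebuilt) / np.linalg.norm(W))
```

Because each partition is fitted independently, the per-partition dictionary pairs could in principle be learned in parallel, which is consistent with the abstract's point that partitioning opens the door to parallel model compression.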
Keywords
Efficient model compression, Dictionary-pair-driven fast DNN compression, Fast weight reconstruction, Less performance loss