Deep Learning Model Compression With Rank Reduction in Tensor Decomposition

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Abstract
Large neural network models are hard to deploy on lightweight edge devices and demand large network bandwidth. In this article, we propose a novel deep learning (DL) model compression method. Specifically, we present a dual-model training strategy with an iterative and adaptive rank reduction (RR) in tensor decomposition. Our method regularizes the DL models while preserving model accuracy. With adaptive RR, the hyperparameter search space is significantly reduced. We provide a theoretical analysis of the convergence and complexity of the proposed method. Testing our method on LeNet, VGG, ResNet, EfficientNet, and RevCol over the MNIST, CIFAR-10/100, and ImageNet datasets, our method outperforms the baseline compression methods in both model compression and accuracy preservation. The experimental results validate our theoretical findings. For VGG-16 on the CIFAR-10 dataset, our compressed model shows a 0.88% accuracy gain with a 10.41 times storage reduction and a 6.29 times speedup. For ResNet-50 on the ImageNet dataset, our compressed model yields a 2.36 times storage reduction and a 2.17 times speedup. In federated learning (FL) applications, our scheme reduces communication overhead by 13.96 times. In summary, our compressed DL method can significantly improve image understanding and pattern recognition processes.
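To make the low-rank idea behind the abstract concrete, here is a minimal sketch of compressing a dense layer's weight matrix by truncated SVD. This illustrates only the generic low-rank decomposition that RR methods build on; the paper's dual-model training and adaptive rank schedule are not reproduced, and the shapes and target rank are assumptions for illustration.

```python
import numpy as np

# Hypothetical dense-layer weight matrix W (out_features x in_features).
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))

# Truncated SVD: keep only the top-`rank` singular components.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = 64  # assumed target rank; adaptive RR would choose this iteratively
A = U[:, :rank] * s[:rank]   # shape (512, rank)
B = Vt[:rank, :]             # shape (rank, 1024)

# The layer y = W @ x becomes y = A @ (B @ x): two thin matmuls.
params_full = W.size
params_low = A.size + B.size
compression = params_full / params_low

rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"compression {compression:.2f}x, relative error {rel_err:.3f}")
```

For these shapes, the rank-64 factorization stores roughly 5.3 times fewer parameters than the full matrix; the fine-tuning step used by compression pipelines (and the paper's regularized training) is what recovers the accuracy lost to the approximation error.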
Keywords
Deep learning (DL), low-rank decomposition, model compression, rank reduction (RR)