MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

Advances in Neural Information Processing Systems 32 (NIPS 2019)

Abstract
The tremendous number of parameters in deep neural networks makes them impractical to deploy in edge-device-based real-world applications, where computational power and storage space are limited. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights (r's) into discrete values (q's) in a supervised training manner. However, the quantization function is non-differentiable, which leads to either infinite or zero gradients (g_r) w.r.t. r. To address this problem, most training-based quantization methods use the gradient w.r.t. q (g_q), with clipping, to approximate g_r via the Straight-Through Estimator (STE), or manually design the gradient computation. However, these methods only heuristically make training-based quantization applicable, without further analysis of how the approximated gradients assist the training of a quantized network. In this paper, we propose to learn g_r by a neural network. Specifically, a meta network is trained using g_q and r as inputs, and outputs g_r for subsequent weight updates. The meta network is updated together with the original quantized network. Our proposed method alleviates the problem of non-differentiability and can be trained in an end-to-end manner. Extensive experiments are conducted with CIFAR10/100 and ImageNet on various deep networks to demonstrate the advantage of our proposed method in terms of a faster convergence rate and better performance. Code is released at: https://github.com/csyhhu/MetaQuant
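To make the contrast in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a sign quantizer and a squared loss, shows how a clipped STE derives the gradient w.r.t. r from the gradient w.r.t. q, and uses a toy two-parameter function as a stand-in for the meta network. All function names, the quantizer, the loss, and the parameter `phi` are hypothetical choices made for illustration.

```python
# Hedged sketch: every name below is an illustrative assumption,
# not part of the MetaQuant code base.

def quantize(r):
    """Assumed quantizer: binarize full-precision weights r into {-1, +1}."""
    return [1.0 if w >= 0 else -1.0 for w in r]

def grad_wrt_q(q, x, y):
    """Gradient w.r.t. q of the assumed loss 0.5 * (q . x - y)^2."""
    err = sum(qi * xi for qi, xi in zip(q, x)) - y
    return [err * xi for xi in x]

def ste_grad(g_q, r, clip=1.0):
    """Clipped STE: copy g_q where |r| <= clip, use zero elsewhere."""
    return [g if abs(w) <= clip else 0.0 for g, w in zip(g_q, r)]

def meta_grad(g_q, r, phi):
    """Toy stand-in for a meta network: maps (g_q, r) to g_r per weight.
    The paper uses a trained neural network here; phi = (a, b) is only a
    placeholder showing the inputs and the output."""
    a, b = phi
    return [a * g + b * w for g, w in zip(g_q, r)]

def sgd_step(r, g_r, lr=0.1):
    """Update the full-precision weights with the approximated gradient."""
    return [w - lr * g for w, g in zip(r, g_r)]

r = [0.3, -1.5, 0.8]      # full-precision weights
x = [1.0, 2.0, -1.0]      # one input sample
y = 0.0                   # its target
q = quantize(r)           # discrete weights used in the forward pass
g_q = grad_wrt_q(q, x, y)

r_ste = sgd_step(r, ste_grad(g_q, r))
r_meta = sgd_step(r, meta_grad(g_q, r, phi=(1.0, 0.0)))
```

In this sketch the STE drops the update for the weight whose magnitude exceeds the clipping threshold, while the stand-in meta function is free to produce a nonzero value there. How the meta network is trained jointly with the quantized network is described in the paper and is not reproduced here.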
Keywords
neural networks, energy consumption, storage space