Poster: Self-Supervised Quantization-Aware Knowledge Distillation

2023 IEEE/ACM Symposium on Edge Computing (SEC 2023)

Abstract
Quantization-aware training (QAT) achieves competitive performance and is widely used for image classification tasks in model compression. Existing QAT works start with a pre-trained full-precision model and perform quantization during retraining. However, these works require supervision from ground-truth labels, whereas sufficient labeled data are often unavailable in real-world environments. They also suffer from accuracy loss due to reduced precision, and no single algorithm consistently achieves the best (or worst) performance on every model architecture. To address these limitations, this paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation framework (SQAKD). SQAKD unifies the forward and backward dynamics of various quantization functions, making it flexible enough to incorporate a wide range of QAT works. With the full-precision model as the teacher and the low-bit model as the student, SQAKD reframes QAT as a co-optimization problem that simultaneously minimizes the KL-Loss (i.e., the Kullback-Leibler divergence between the teacher's and student's penultimate outputs) and the discretization error (i.e., the difference between the full-precision weights/activations and their quantized counterparts). This optimization is performed in a self-supervised manner, without labeled data. The evaluation shows that SQAKD significantly improves the performance of various state-of-the-art QAT works (e.g., PACT, LSQ, DoReFa, and EWGS). SQAKD establishes stronger baselines and does not require extensive labeled training data, potentially making state-of-the-art QAT research more accessible.
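The co-optimization described in the abstract can be pictured with a minimal PyTorch-style sketch of the combined objective. This is an illustrative assumption, not the authors' implementation: the function name sqakd_loss, the distillation temperature, and the weighting factor lambda_d are hypothetical names introduced here.

```python
# Minimal sketch of an SQAKD-style co-optimization objective (illustrative,
# not the paper's exact formulation): KL divergence between the teacher's and
# student's outputs plus the discretization error of the quantized weights.
import torch
import torch.nn.functional as F

def sqakd_loss(teacher_logits, student_logits, fp_weights, q_weights,
               temperature=4.0, lambda_d=1.0):
    # KL-Loss between softened teacher and student distributions
    # (no ground-truth labels appear anywhere in this objective).
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Discretization error: distance between the full-precision weights and
    # their quantized counterparts, summed over all quantized layers.
    disc = sum(F.mse_loss(w_fp, w_q)
               for w_fp, w_q in zip(fp_weights, q_weights))

    # lambda_d balances distillation against quantization fidelity.
    return kl + lambda_d * disc
```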
Keywords
Kullback-Leibler Divergence, Model Architecture, Worst Performance, Ground-Truth Labels, Labeled Training Data, Discretization Error, Self-Supervised Manner, Quantization, Convergence Rate, Quantization Method