NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation
CVPR 2024 (2023)
Abstract
Data-Free Knowledge Distillation (DFKD) has made significant recent strides
by transferring knowledge from a teacher neural network to a student neural
network without accessing the original data. Nonetheless, existing approaches
encounter a significant challenge when attempting to generate samples from
random noise inputs, which inherently lack meaningful information.
Consequently, these models struggle to effectively map this noise to the
ground-truth sample distribution, resulting in prolonged training times and
low-quality outputs. In this paper, we propose a novel Noisy Layer Generation
method (NAYER) which relocates the random source from the input to a noisy
layer and utilizes the meaningful constant label-text embedding (LTE) as the
input. LTE is generated by using the language model once, and then it is stored
in memory for all subsequent training processes. The significance of LTE lies
in its ability to contain substantial meaningful inter-class information,
enabling the generation of high-quality samples with only a few training steps.
Simultaneously, the noisy layer plays a key role in addressing the issue of
diversity in sample generation by preventing the model from overemphasizing the
constrained label information. By reinitializing the noisy layer in each
iteration, we aim to facilitate the generation of diverse samples while still
retaining the method's efficiency, thanks to the ease of learning provided by
LTE. Experiments carried out on multiple datasets demonstrate that our NAYER
not only outperforms the state-of-the-art methods but also achieves speeds 5 to
15 times faster than previous approaches. The code is available at
https://github.com/tmtuan1307/nayer.
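
The core idea described above (a constant, meaningful input combined with a randomly reinitialized layer as the only source of randomness) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the dimensions, and the Gaussian initialization are assumptions chosen for illustration, the random vectors stand in for real label-text embeddings from a language model, and the trained generator, teacher, and student networks are omitted entirely.

```python
import random

# Hypothetical sizes, chosen only for illustration.
NUM_CLASSES, EMBED_DIM, HIDDEN_DIM = 3, 8, 4


def make_noisy_layer(in_dim, out_dim, rng):
    """Return a freshly initialized linear layer as (weights, bias).

    Assumption: Gaussian initialization scaled by 1/sqrt(in_dim).
    """
    scale = in_dim ** -0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [rng.gauss(0.0, scale) for _ in range(out_dim)]
    return weights, bias


def apply_layer(layer, vector):
    """Apply the linear layer to one input vector."""
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def generate_features(lte, rng):
    """Pass the constant embeddings through a newly reinitialized noisy layer.

    The input never changes; the variation between calls comes only
    from the layer's fresh random parameters.
    """
    layer = make_noisy_layer(EMBED_DIM, HIDDEN_DIM, rng)
    return [apply_layer(layer, embedding) for embedding in lte]


rng = random.Random(0)

# Stand-in for the label-text embedding: one fixed vector per class,
# computed once and reused in every iteration.
lte = [[rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
       for _ in range(NUM_CLASSES)]
lte_snapshot = [row[:] for row in lte]

first = generate_features(lte, rng)   # iteration 1
second = generate_features(lte, rng)  # iteration 2, new noisy layer
```

In this sketch, `first` and `second` differ even though `lte` is identical in both calls, which mirrors the role the abstract assigns to the noisy layer: supplying diversity while the input carries the class information.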