Smarter peer learning for online knowledge distillation

Multimedia Systems (2022)

Abstract
Model distillation is an effective way for a less-parameterized student model to learn the knowledge of a large teacher model. However, it requires a well-trained, high-performance teacher in advance, which limits the deployment of deep models on some multimedia devices, and such a powerful teacher is not always available. Given this, some researchers have proposed replacing the traditional teacher–student paradigm with mutual learning among student models. Although this approach has achieved good results recently, simple mutual learning between student networks tends to saturate early. In this work, we propose a smarter mutual learning method called Smarter Peer Learning (SPL) for online knowledge distillation, which introduces a weight evaluation mechanism to build a virtual teacher within a novel online distillation framework. The ensemble teacher is constructed by combining the outputs of the student networks according to the calculated weights, so that in the next stage of learning each student learns more from its better-performing peers. Experiments show that SPL trains more effective students than several existing advanced methods when applied with various backbone networks to the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets.
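To make the idea concrete, below is a minimal PyTorch sketch of a weighted-ensemble online distillation step of the kind the abstract describes. The weight rule (softmax over each peer's negative batch loss), the temperature `T`, and the balance factor `alpha` are illustrative assumptions; the paper's actual weight evaluation mechanism is not detailed in this abstract.

```python
import torch
import torch.nn.functional as F

def evaluate_weights(student_logits, labels):
    # Hypothetical weight evaluation: peers with lower cross-entropy on
    # the current batch receive larger weight (softmax over negative loss).
    # The paper's actual mechanism may differ from this stand-in.
    losses = torch.stack([F.cross_entropy(l, labels) for l in student_logits])
    return F.softmax(-losses, dim=0)  # [num_students], sums to 1

def ensemble_teacher(student_logits, weights):
    # Weighted combination of peer outputs forms the virtual teacher.
    stacked = torch.stack(student_logits, dim=0)  # [S, B, C]
    w = weights.view(-1, 1, 1)                    # broadcast over batch, classes
    return (w * stacked).sum(dim=0)               # ensemble logits [B, C]

def spl_losses(student_logits, labels, T=3.0, alpha=0.5):
    # Each student minimizes cross-entropy on the labels plus a KL term
    # that pulls it toward the weighted ensemble teacher.
    weights = evaluate_weights(student_logits, labels)
    teacher = ensemble_teacher(student_logits, weights).detach()
    losses = []
    for logits in student_logits:
        ce = F.cross_entropy(logits, labels)
        kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                      F.softmax(teacher / T, dim=1),
                      reduction="batchmean") * T * T
        losses.append(ce + alpha * kd)
    return losses  # one scalar loss per student; backprop each separately
```

With three peers, `spl_losses([s1(x), s2(x), s3(x)], y)` returns three losses, each backpropagated into its own network; detaching the teacher keeps gradients from flowing through the ensemble, so better-performing peers shape the target without being penalized by it.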
Keywords
Knowledge distillation, Online distillation, Mutual learning, Knowledge transfer