Efficient Crowd Counting via Dual Knowledge Distillation

IEEE TRANSACTIONS ON IMAGE PROCESSING (2024)

Abstract
Most researchers focus on designing accurate crowd counting models with heavy parameters and computations but ignore the resource burden during model deployment. Real-world scenarios demand an efficient counting model with low latency and high performance. Knowledge distillation provides an elegant way to transfer knowledge from a complicated teacher model to a compact student model while maintaining accuracy. However, when the teacher misinterprets the input, its supervision gives the student model incorrect guidance. In this paper, we propose a dual-knowledge distillation (DKD) framework, which aims to reduce the side effects of the teacher model and transfer hierarchical knowledge to obtain a more efficient counting model. First, the student model is initialized with global information transferred by the teacher model via adaptive perspectives. Then, self-knowledge distillation forces the student model to learn from itself, based on intermediate feature maps and the target map. Specifically, the optimal transport distance is used to measure the difference between teacher and student feature maps and to align the distributions of the counting area. Extensive experiments on four challenging datasets demonstrate the superiority of DKD. With only approximately 6% of the parameters and computations of the original model, the student model counts faster and more accurately, matching and even surpassing the teacher model.
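The abstract mentions using an optimal transport distance to align teacher and student feature maps. Below is a minimal sketch of how such a feature-alignment term could be computed with entropy-regularized (Sinkhorn) optimal transport; it is not the authors' released code, and the function names, cosine cost, `eps`, and `n_iters` values are illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F


def sinkhorn_ot_distance(fs, ft, eps=0.05, n_iters=50):
    """Entropy-regularized OT distance between two sets of feature vectors.

    fs: student features, shape (N, C); ft: teacher features, shape (M, C).
    Returns a scalar transport cost (lower = better aligned).
    """
    # Pairwise cost: 1 - cosine similarity between feature vectors.
    fs_n = F.normalize(fs, dim=1)
    ft_n = F.normalize(ft, dim=1)
    cost = 1.0 - fs_n @ ft_n.t()                               # (N, M)

    n, m = cost.shape
    # Uniform marginals over spatial locations (an assumption for this sketch).
    log_mu = torch.full((n,), -math.log(n), device=cost.device)
    log_nu = torch.full((m,), -math.log(m), device=cost.device)

    # Log-domain Sinkhorn iterations for numerical stability.
    u = torch.zeros(n, device=cost.device)
    v = torch.zeros(m, device=cost.device)
    for _ in range(n_iters):
        u = eps * (log_mu - torch.logsumexp((-cost + v[None, :]) / eps, dim=1))
        v = eps * (log_nu - torch.logsumexp((-cost + u[:, None]) / eps, dim=0))

    # Transport plan and resulting transport cost.
    plan = torch.exp((-cost + u[:, None] + v[None, :]) / eps)
    return (plan * cost).sum()


def ot_distill_loss(student_feat, teacher_feat):
    """Align (B, C, H, W) feature maps location-by-location via OT."""
    b, c, _, _ = student_feat.shape
    loss = student_feat.new_zeros(())
    for i in range(b):
        fs = student_feat[i].reshape(c, -1).t()                # (H*W, C)
        ft = teacher_feat[i].reshape(c, -1).t()
        loss = loss + sinkhorn_ot_distance(fs, ft.detach())    # teacher is fixed
    return loss / b


if __name__ == "__main__":
    student_feat = torch.randn(2, 64, 16, 16)
    teacher_feat = torch.randn(2, 64, 16, 16)
    print(ot_distill_loss(student_feat, teacher_feat))
```

In a full distillation pipeline, a term like this would typically be weighted and added to the counting loss on the predicted density map; the exact combination used in the paper is not specified in the abstract.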
Keywords
Computational modeling, Adaptation models, Feature extraction, Task analysis, Knowledge transfer, Loss measurement, Estimation, Crowd counting, knowledge transfer, self-knowledge distillation, optimal transport distance