Self-Decoupling and Ensemble Distillation for Efficient Segmentation.

AAAI(2023)

引用 2|浏览42
暂无评分
摘要
Knowledge distillation (KD) is a promising teacher-student learning paradigm that transfers information from a cumbersome teacher to a student network. To avoid the training cost of a large teacher network, the recent studies propose to distill knowledge from the student itself, called Self-KD. However, due to the limitations of the performance and capacity of the student, the soft-labels or features distilled by the student barely provide reliable guidance. Moreover, most of the Self-KD algorithms are specific to classification tasks based on soft-labels, and not suitable for semantic segmentation. To alleviate these contradictions, we revisit the label and feature distillation problem in segmentation, and propose Self-Decoupling and Ensemble Distillation for Efficient Segmentation (SDES). Specifically, we design a decoupled prediction ensemble distillation (DPED) algorithm that generates reliable soft-labels with multiple expert decoders, and a decoupled feature ensemble distillation (DFED) mechanism to utilize more important channel-wise feature maps for encoder learning. The extensive experiments on three public segmentation datasets demonstrate the superiority of our approach and the efficacy of each component in the framework through the ablation study.
更多
查看译文
关键词
ensemble distillation,efficient segmentation,self-decoupling
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要