$\mathcal{G}$-Distillation: Reducing Overconfident Errors on Novel Samples

arXiv: Computer Vision and Pattern Recognition (2018)

Abstract
Counter to the intuition that unfamiliarity should lead to lack of confidence, current algorithms often make highly confident yet wrong predictions when faced with test samples from an unknown distribution different from training. Unlike domain adaptation methods, we cannot gather an "unexpected dataset" prior to test. We propose a simple solution that reduces overconfident errors on samples from an unknown distribution without increasing evaluation time: train an ensemble of classifiers and then distill it into a single model using both labeled and unlabeled examples. Experimentally, we investigate the overconfidence problem and evaluate our solution by creating "familiar" and "novel" test splits, where familiar samples are identically distributed with training and novel samples are not. We show that our solution yields more appropriate prediction confidences, on both familiar and novel data, compared to single models and ensembles distilled on training data only. For example, we reduce confident errors in gender recognition by 94% on demographic groups different from the training data.
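The recipe the abstract describes, train an ensemble and then distill it into one student using both labeled and unlabeled data, can be sketched compactly. The PyTorch sketch below is an illustration under stated assumptions, not the paper's implementation: `make_model`, the synthetic data, the `alpha` loss weighting, and the untempered KL distillation term are all assumptions introduced here.

```python
# Minimal sketch of ensemble-then-distill for better confidence on novel
# samples. All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model(in_dim=20, n_classes=2):
    # Stand-in classifier; the actual work would use task-specific networks.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

def train_ensemble(n_members, labeled_x, labeled_y, epochs=50):
    members = []
    for _ in range(n_members):
        model = make_model()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            F.cross_entropy(model(labeled_x), labeled_y).backward()
            opt.step()
        members.append(model)
    return members

@torch.no_grad()
def ensemble_probs(members, x):
    # Average of the members' predictive distributions: the soft target
    # the distilled student tries to match.
    return torch.stack([F.softmax(m(x), dim=1) for m in members]).mean(0)

def distill(members, labeled_x, labeled_y, unlabeled_x, epochs=50, alpha=0.5):
    student = make_model()
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        # Supervised term on labeled data only.
        ce = F.cross_entropy(student(labeled_x), labeled_y)
        # Distillation term: match the ensemble on BOTH labeled and
        # unlabeled inputs, so confidence stays sensible off-distribution.
        all_x = torch.cat([labeled_x, unlabeled_x])
        kl = F.kl_div(F.log_softmax(student(all_x), dim=1),
                      ensemble_probs(members, all_x),
                      reduction="batchmean")
        (alpha * ce + (1 - alpha) * kl).backward()
        opt.step()
    return student

if __name__ == "__main__":
    torch.manual_seed(0)
    labeled_x = torch.randn(128, 20)
    labeled_y = (labeled_x[:, 0] > 0).long()
    unlabeled_x = torch.randn(256, 20) + 2.0  # shifted, "novel-like" inputs
    members = train_ensemble(5, labeled_x, labeled_y)
    student = distill(members, labeled_x, labeled_y, unlabeled_x)
    print(ensemble_probs([student], unlabeled_x)[:3])
```

The point of the design is that the distillation term covers the unlabeled pool as well as the labeled set, which is what lets the single student inherit the ensemble's more cautious predictive distribution away from the training data without any extra cost at evaluation time.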