Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Cited 240 | Views 133
Abstract
When human annotators are given a choice about what to label in an image, they apply their own subjective judgments on what to ignore and what to mention. We refer to these noisy "human-centric" annotations as exhibiting human reporting bias. Examples of such annotations include image tags and keywords found on photo sharing sites, or in datasets containing image captions. In this paper, we use these noisy annotations for learning visually correct image classifiers. Such annotations do not use consistent vocabulary, and miss a significant amount of the information present in an image; however, we demonstrate that the noise in these annotations exhibits structure and can be modeled. We propose an algorithm to decouple the human reporting bias from the correct visually grounded labels. Our results are highly interpretable for reporting "what's in the image" versus "what's worth saying." We demonstrate the algorithm's efficacy along a variety of metrics and datasets, including MS COCO and Yahoo Flickr 100M. We show significant improvements over traditional algorithms for both image classification and image captioning, doubling the performance of existing methods in some cases.
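The abstract does not spell out how the decoupling is parameterized, so the following is a minimal PyTorch sketch of one plausible reading, under stated assumptions: a latent visual-presence variable v is marginalized out to produce the probability that a label is *mentioned* by a human annotator, and the presence head alone is used at test time for "what's in the image." The class and attribute names (DecoupledClassifier, rel_if_present, rel_if_absent) and the exact factorization are illustrative assumptions, not the authors' published model.

```python
import torch
import torch.nn as nn

class DecoupledClassifier(nn.Module):
    """Illustrative sketch: per-label heads for visual presence and for
    reporting relevance; the noisy 'mentioned' label is modeled by
    marginalizing over presence (assumption, not the paper's exact form)."""
    def __init__(self, feat_dim, num_labels):
        super().__init__()
        self.visual = nn.Linear(feat_dim, num_labels)          # p(v=1 | image)
        self.rel_if_present = nn.Linear(feat_dim, num_labels)  # p(mention | v=1, image)
        self.rel_if_absent = nn.Linear(feat_dim, num_labels)   # p(mention | v=0, image)

    def forward(self, feats):
        pv = torch.sigmoid(self.visual(feats))
        pr1 = torch.sigmoid(self.rel_if_present(feats))
        pr0 = torch.sigmoid(self.rel_if_absent(feats))
        # Probability the label is mentioned, marginalized over presence:
        # p(mention) = p(mention|v=1) p(v=1) + p(mention|v=0) p(v=0)
        p_mention = pr1 * pv + pr0 * (1.0 - pv)
        return p_mention, pv

# Training fits p_mention to the noisy human-centric labels; pv is then
# read off as the de-biased visual prediction.
model = DecoupledClassifier(feat_dim=2048, num_labels=1000)
feats = torch.randn(8, 2048)                        # e.g. CNN image features
noisy_labels = torch.randint(0, 2, (8, 1000)).float()
p_mention, pv = model(feats)
loss = nn.functional.binary_cross_entropy(p_mention, noisy_labels)
loss.backward()
```

In this sketch the noise is supervised only indirectly: gradients through p_mention push the relevance heads to absorb the systematic "worth saying" bias, leaving pv to track visual grounding, which matches the abstract's claim of interpretable "what's in the image" versus "what's worth saying" outputs.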
Keywords
human reporting bias,visual image classifiers,noisy human-centric labels,human annotators,image label,noisy human-centric annotations,image tags,photo sharing sites,visually grounded labels,MS COCO,Yahoo Flickr 100M,image classification,image captioning