Learning to Disambiguate by Asking Discriminative Questions

2017 IEEE International Conference on Computer Vision (ICCV), 2017

Abstract
The ability to ask questions is a powerful tool to gather information in order to learn about the world and resolve ambiguities. In this paper, we explore a novel problem of generating discriminative questions to help disambiguate visual instances. Our work can be seen as a complement and new extension to the rich research studies on image captioning and question answering. We introduce the first large-scale dataset with over 10,000 carefully annotated images-question tuples to facilitate benchmarking. In particular, each tuple consists of a pair of images and 4.6 discriminative questions (as positive samples) and 5.9 non-discriminative questions (as negative samples) on average. In addition, we present an effective method for visual discriminative question generation. The method can be trained in a weakly supervised manner without discriminative images-question tuples but just existing visual question answering datasets. Promising results are shown against representative baselines through quantitative evaluations and user studies.
Keywords
visual instances,image captioning,nondiscriminative questions,visual discriminative question generation,discriminative images-question tuples,visual question answering datasets,discriminative questions asking,annotated images-question tuples,positive samples,negative samples