Modeling Context in Referring Expressions

Computer Vision - ECCV 2016, Part II (2016)

Abstract
Humans refer to objects in their environments all the time, especially in dialogue with other people. We explore generating and comprehending natural language referring expressions for objects in images. In particular, we focus on incorporating better measures of visual context into referring expression models and find that visual comparison to other objects within an image significantly improves performance. We also develop methods to tie the language generation process together, so that we generate expressions for all objects of a particular category jointly. Evaluation on three recent datasets, RefCOCO, RefCOCO+, and RefCOCOg (datasets and toolbox can be downloaded from https://github.com/lichengunc/refer), shows the advantages of our methods for both referring expression generation and comprehension.
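The abstract's central idea, comparing the target object to other objects of the same category in the image, can be pictured as a feature-construction step. The sketch below is only an illustration of that idea under stated assumptions, not the authors' exact model: the function names, the mean-pooled differences, and the geometry encoding are choices made here for clarity.

```python
# Illustrative sketch (not the paper's exact feature definition): pair a target
# object's appearance and geometry with its average visual differences from
# other same-category objects, reflecting the abstract's "visual comparison".
import numpy as np

def box_geometry(box, img_w, img_h):
    """Normalized [x1, y1, x2, y2, area] for a box given as (x, y, w, h)."""
    x, y, w, h = box
    return np.array([x / img_w, y / img_h, (x + w) / img_w, (y + h) / img_h,
                     (w * h) / (img_w * img_h)])

def context_features(target_feat, target_box, others, img_w, img_h):
    """
    target_feat: appearance feature of the target region (1-D array, e.g. CNN).
    target_box:  (x, y, w, h) of the target.
    others:      list of (feat, box) for other objects of the same category.
    Returns one vector combining the target's own appearance and geometry with
    mean differences to the comparison objects.
    """
    geo = box_geometry(target_box, img_w, img_h)
    if others:
        appearance_diff = np.mean([target_feat - f for f, _ in others], axis=0)
        geometry_diff = np.mean(
            [geo - box_geometry(b, img_w, img_h) for _, b in others], axis=0)
    else:  # no same-category objects: zero out the comparison terms
        appearance_diff = np.zeros_like(target_feat)
        geometry_diff = np.zeros_like(geo)
    return np.concatenate([target_feat, geo, appearance_diff, geometry_diff])
```

Such a vector could then condition an expression generator or a comprehension scorer; the paper itself should be consulted for the actual feature set and model.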
Keywords
Language, Language and vision, Generation, Referring expression generation