Deep Semantic Indexing Using Convolutional Localization Network With Region-Based Visual Attention For Image Database

DATABASES THEORY AND APPLICATIONS, ADC 2017(2017)

引用 9|浏览31
暂无评分
摘要
In this paper, we introduce a novel deep semantic indexing method, a.k.a. captioning, for image database. Our method can automatically generate a natural language caption describing an image as a semantic reference to index the image. Specifically, we use a convolutional localization network to generate a pool of region proposals from an image, and then leverage the visual attention mechanism to sequentially generate the meaningful language words. Compared with previous methods, our approach can efficiently generate compact captions, which can guarantee higher level of semantic indexing for image database. We evaluate our approach on two widely-used benchmark datasets: Flickr30K, and MS COCO. Experimental results across various evaluation metrics show the superiority of our approach as compared with other visual attention based approaches.
更多
查看译文
关键词
Semantic indexing, Image database, Visual attention, Region proposals, Convolutional localization network
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要