Knowledge Aware Semantic Concept Expansion for Image-Text Matching.

IJCAI(2019)

引用 69|浏览116
暂无评分
摘要
Image-text matching is a vital cross-modality task in artificial intelligence and has attracted increasing attention in recent years. Existing works have shown that learning semantic concepts is useful to enhance image representation and can significantly improve the performance of both image-to-text and text-to-image retrieval. However, existing models simply detect semantic concepts from a given image, which are less likely to deal with long-tail and occlusion concepts. Frequently co-occurred concepts in the same scene, e.g. bedroom and bed, can provide common-sense knowledge to discover other semantic-related concepts. In this paper, we develop a Scene Concept Graph (SCG) by aggregating image scene graphs and extracting frequently co-occurred concept pairs as scene common-sense knowledge. Moreover, we propose a novel model to incorporate this knowledge to improve image-text matching. Specifically, semantic concepts are detected from images and then expanded by the SCG. After learning to select relevant contextual concepts, we fuse their representations with the image embedding feature to feed into the matching module. Extensive experiments are conducted on Flickr30K and MSCOCO datasets, and prove that our model achieves state-of-the-art results due to the effectiveness of incorporating the external SCG.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要