An image retrieval method based on semantic matching with multiple positional representations

Multimedia Tools and Applications (2019)

Abstract
Text-based image retrieval requires either manual annotation or automatic machine labeling. Manual annotation is time-consuming, and a simple text description rarely captures the full content of an image. Existing deep models rely on a single sentence representation and therefore cannot adequately capture contextualized local information during matching. In response to these problems, this paper presents a new retrieval approach based on image captioning. First, descriptive sentences for the images are generated with an image captioning model. Then, for sentence matching, we propose a semantic matching model with multiple positional representations, which uses two interrelated Bi-LSTMs and an attention mechanism to match sentences. The matching score is produced by aggregating the interactions between these different positional sentence representations. The sentence matching model is used to match the retrieval sentence against the image description sentences in the image library. In our experiments, both the proposed image captioning model and the sentence matching model achieve higher accuracy than the competitive models, and our method can complete the image retrieval task.
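The retrieval step described above (score the query sentence against each image's description sentence, then rank the images) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `jaccard_score`, `retrieve`, and the `img_...` identifiers are hypothetical, the captions are made up, and the token-overlap scorer is a simple stand-in for the paper's Bi-LSTM and attention matching model, which is not reproduced here.

```python
from typing import Callable, Dict, List, Tuple


def jaccard_score(query: str, caption: str) -> float:
    """Placeholder matcher: token-set overlap (Jaccard similarity).

    This is NOT the paper's matching model; it only stands in for
    any function that maps a sentence pair to a matching score.
    """
    q = set(query.lower().split())
    c = set(caption.lower().split())
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def retrieve(
    query: str,
    captions: Dict[str, str],
    score: Callable[[str, str], float] = jaccard_score,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank images by how well their caption matches the query sentence."""
    scored = [(image_id, score(query, caption))
              for image_id, caption in captions.items()]
    # Highest matching score first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical captions, as if produced offline by a captioning model.
captions = {
    "img_001": "a dog runs across a grassy field",
    "img_002": "two people ride bicycles on a city street",
    "img_003": "a cat sleeps on a sofa",
}

results = retrieve("a dog in a field", captions, top_k=2)
for image_id, s in results:
    print(image_id, round(s, 3))
```

Because the scorer is passed in as a parameter, a learned sentence matching model could be substituted for `jaccard_score` without changing the ranking logic.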
Keywords
Image retrieval, Image caption, Sentence matching, Multiple positional representations