Towards Cycle-Consistent Models For Text And Image Retrieval

COMPUTER VISION - ECCV 2018 WORKSHOPS, PT IV (2018)

Abstract
Cross-modal retrieval has recently become a research hotspot, thanks to the development of deeply-learnable architectures. Such architectures generally learn a joint multi-modal embedding space into which text and images can be projected and compared. Here we investigate a different approach and reformulate cross-modal retrieval as the problem of learning a translation between the textual and visual domains. In particular, we propose an end-to-end trainable model that translates text into image features and vice versa, and regularizes this mapping with a cycle-consistency criterion. Preliminary experimental evaluations show promising results compared with ordinary visual-semantic models.
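To make the cycle-consistency idea concrete, here is a minimal, dependency-free sketch. It is not the authors' model: the abstract does not specify the architecture or the loss, so the linear translators (`W_t2i`, `W_i2t`), the squared-error cycle penalty, and the cosine-similarity ranking are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cosine(a, b):
    """Cosine similarity; assumes both vectors are non-zero."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def cycle_loss(W_t2i, W_i2t, text_vec, image_vec):
    """Assumed cycle penalty: reconstruction error after a round trip.

    text -> image -> text should return the original text features,
    and image -> text -> image the original image features.
    """
    text_cycle = matvec(W_i2t, matvec(W_t2i, text_vec))
    image_cycle = matvec(W_t2i, matvec(W_i2t, image_vec))
    return sq_dist(text_cycle, text_vec) + sq_dist(image_cycle, image_vec)

def retrieve_images(W_t2i, text_vec, image_vecs):
    """Translate a text query into image-feature space and rank images."""
    query = matvec(W_t2i, text_vec)
    scores = [cosine(query, v) for v in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: -scores[i])

# Toy 2-D example: the text-to-image map swaps the two coordinates.
swap = [[0, 1], [1, 0]]
identity = [[1, 0], [0, 1]]

consistent = cycle_loss(swap, swap, [1, 2], [3, 4])        # swap undoes itself
inconsistent = cycle_loss(swap, identity, [1, 2], [3, 4])  # round trip fails
ranking = retrieve_images(swap, [1, 0], [[1, 0], [0, 1], [1, 1]])
```

In this toy setting the pair of translators that invert each other incurs zero cycle penalty, while the mismatched pair is penalized. In a trained model such a term would be added to a retrieval objective as a regularizer rather than used on its own.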
Keywords
Cross-modal retrieval, Cycle consistency, Visual-semantic models