SMAN: Stacked Multimodal Attention Network for Cross-Modal Image–Text Retrieval

IEEE Transactions on Cybernetics (2022)

Abstract
This article tackles the task of cross-modal image–text retrieval, which has been an interdisciplinary topic in both the computer vision and natural language processing communities. Existing global representation alignment-based methods fail to pinpoint the semantically meaningful portions of images and texts, while the local representation alignment schemes suffer from the huge computat...
Keywords
Visualization, Semantics, Feature extraction, Correlation, Task analysis, Extraterrestrial measurements, Deep learning