Autoencoder-based self-supervised hashing for cross-modal retrieval
Multimedia Tools and Applications (2020)
Abstract
Cross-modal retrieval has attracted considerable attention in the era of the multimedia data explosion. Owing to their low storage cost and fast retrieval speed, hash learning-based methods have become increasingly popular in this field. The crucial bottlenecks of cross-modal retrieval are twofold: the heterogeneous gap between different modalities and the semantic gap among similar data of various modalities. To address these issues, we adopt a self-supervised approach to bridge the heterogeneous gap by generating cohesive features of different instances. To mitigate the semantic gap, we use triplet sampling to optimize both the inter-modal and intra-modal semantic losses, which increases the discriminability of our approach. Experiments on two benchmark datasets show the efficiency and robustness of our method, and extended experiments show its scalability.
Keywords
Cross-modal retrieval, Hash learning, Autoencoder, Self-supervised
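The abstract does not give the exact form of the triplet-based semantic loss, so the following is only a minimal sketch of a generic triplet margin loss on relaxed (real-valued) hash codes. The function names, the squared Euclidean distance, and the margin value are all assumptions made for illustration, not the paper's formulation. For an inter-modal triplet, the anchor would come from one modality (e.g. image codes) and the positive and negative from the other (e.g. text codes); for an intra-modal triplet, all three come from the same modality.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (hypothetical helper, not from the paper).

    Each argument is an array of shape (batch, code_length) holding
    relaxed hash codes. The loss is zero once the negative is farther
    from the anchor than the positive by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor-negative distance
    return float(np.mean(np.maximum(0.0, margin + d_pos - d_neg)))

def binarize(codes):
    """Map relaxed codes to {-1, +1} hash bits (zero maps to +1)."""
    return np.where(codes >= 0, 1, -1)

def hamming_distance(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

# Toy inter-modal triplet: one image code, a matching and a non-matching text code.
image_code = np.array([[0.9, -0.8, 0.7, -0.6]])
text_match = np.array([[0.8, -0.9, 0.6, -0.7]])
text_other = np.array([[-0.9, 0.8, -0.7, 0.6]])

loss = triplet_margin_loss(image_code, text_match, text_other)
```

At retrieval time, the relaxed codes would be binarized and compared by Hamming distance, which is where the low storage cost and fast lookup of hashing methods come from.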