RICH: A rapid method for image-text cross-modal hash retrieval

Displays (2023)

Abstract
Deep cross-modal hash retrieval (DCMHR) methods can effectively analyze the correlation of multimodal data while maintaining efficiency. However, in the pursuit of higher accuracy, most existing hash methods overlook that the original purpose of introducing hashing is to reduce training cost, and overtraining also leads to overfitting. This paper proposes a rapid method for image-text cross-modal hash retrieval (RICH) based on DenseNet and multi-head attention (MHA) over Bag-of-Words (BOW) text features; it makes full use of unlabeled samples and applies early stopping during training. To fully extract image features, we propose multiple dense feature sampling for cross-modal retrieval. Notably, this is the first method to apply DenseNet and early stopping to unsupervised cross-modal retrieval, which greatly reduces training cost while preserving good results. We further show that incorporating MHA into the TxtNet extracts features that would otherwise be neglected. In addition, to alleviate the heterogeneity gap between modalities, we use auxiliary similarity metrics. Experiments on three datasets show that the average performance of this method exceeds that of most DCMHR methods. Moreover, compared with most state-of-the-art unsupervised DCMHR methods, RICH offers lower training cost and better stability, demonstrating the effectiveness and superiority of the method.
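To make the TxtNet idea concrete, the following is a minimal sketch (not the authors' code) of applying multi-head self-attention to BOW text features before hashing. The layer sizes, token reshaping, BOW dimensionality, and class name are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class TxtNetSketch(nn.Module):
    """Illustrative TxtNet-style hashing head: BOW -> MHA -> binary hash codes."""
    def __init__(self, bow_dim: int = 1024, hash_bits: int = 64,
                 tokens: int = 8, token_dim: int = 64, heads: int = 4):
        super().__init__()
        self.tokens, self.token_dim = tokens, token_dim
        # Project the sparse BOW vector into a short sequence of dense "tokens"
        self.embed = nn.Linear(bow_dim, tokens * token_dim)
        self.mha = nn.MultiheadAttention(token_dim, heads, batch_first=True)
        # Continuous relaxation of the hash code; binarized at retrieval time
        self.hash_layer = nn.Linear(tokens * token_dim, hash_bits)

    def forward(self, bow: torch.Tensor) -> torch.Tensor:
        b = bow.size(0)
        x = self.embed(bow).view(b, self.tokens, self.token_dim)
        # Self-attention lets the model recover feature interactions
        # that a plain feed-forward encoding of the BOW vector may neglect
        attn_out, _ = self.mha(x, x, x)
        h = torch.tanh(self.hash_layer(attn_out.reshape(b, -1)))
        return torch.sign(h)   # {-1, +1} hash codes

# Example usage with a random batch of BOW vectors
codes = TxtNetSketch()(torch.rand(4, 1024))
print(codes.shape)  # torch.Size([4, 64])
```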
Keywords
Cross-modal hash retrieval, Unsupervised learning, Multiple dense feature sampling, Multi-head attention