A semi-supervised cross-modal memory bank for cross-modal retrieval

Neurocomputing (2024)

Abstract
The core of semi-supervised cross-modal retrieval lies in leveraging limited supervised information to measure the similarity between cross-modal data. Current approaches assume an association between each unlabelled instance and its pre-defined k-nearest neighbours, relying on classifier performance for this selection. As labelled data diminishes, classifier performance weakens, resulting in erroneous associations among unlabelled instances. Moreover, the lack of interpretability in the class probabilities of unlabelled data hinders classifier learning. This paper therefore focuses on learning pseudo-labels for unlabelled data, providing pseudo-supervision to aid classifier learning. Specifically, a cross-modal memory bank is proposed that dynamically stores, for each cross-modal instance, a feature representation in a common space and a class-probability representation in a label space. Pseudo-labels are derived by computing feature-representation similarity and adjusting class probabilities accordingly. During this process, imposing a classification loss on labelled data and contrastive losses between paired cross-modal data is a prerequisite for successful pseudo-label learning and significantly enhances the credibility of these pseudo-labels. Empirical findings demonstrate that using only 10% labelled data, this method outperforms prevailing semi-supervised techniques by 2.6%, 1.8%, and 4.9% in MAP@50 on the Wikipedia, NUS-WIDE, and MS-COCO datasets, respectively.
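The pseudo-labelling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the top-k neighbourhood size, and the softmax temperature are all assumptions, and the memory bank is modelled simply as two arrays holding the stored common-space features and class-probability vectors.

```python
import numpy as np

def pseudo_label(query_feat, bank_feats, bank_probs, k=5, temperature=0.1):
    """Derive a pseudo-label for one unlabelled sample (illustrative sketch).

    query_feat : (d,) common-space feature of the unlabelled sample
    bank_feats : (N, d) features stored in the cross-modal memory bank
    bank_probs : (N, C) class-probability vectors stored alongside them
    """
    # Cosine similarity between the query and every memory-bank entry.
    q = query_feat / np.linalg.norm(query_feat)
    b = bank_feats / np.linalg.norm(bank_feats, axis=1, keepdims=True)
    sims = b @ q
    # Keep the k most similar entries (hypothetical neighbourhood size).
    top = np.argsort(sims)[-k:]
    # Softmax-weight their stored class probabilities to adjust the
    # class-probability estimate for the query.
    w = np.exp(sims[top] / temperature)
    w /= w.sum()
    return w @ bank_probs[top]  # (C,) pseudo-label distribution
```

Because the weights sum to one and each stored probability vector sums to one, the returned pseudo-label is itself a valid class distribution.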
Keywords
Common space, Cross-modal memory bank, Pseudo-labels, Class probability