Beyond The Deep Metric Learning: Enhance The Cross-Modal Matching With Adversarial Discriminative Domain Regularization

2020 25th International Conference on Pattern Recognition (ICPR), 2020


Abstract

Matching information across image and text modalities is a fundamental challenge for many applications that involve both vision and natural language processing. The objective is to find efficient similarity metrics for comparing visual and textual information. Existing approaches mainly match local visual objects and sentence words in a shared space with attention mechanisms. Matching performance remains limited because the similarity computation relies on simple comparisons of the matching features and ignores the characteristics of their distribution in the data. In this paper, we address this limitation with an efficient learning objective that considers the discriminative feature distributions between the visual objects and sentence words. Specifically, we propose a novel Adversarial Discriminative Domain Regularization (ADDR) learning framework, which goes beyond the paradigm metric learning objective, to construct a set of discriminative data domains within each image-text pair. Our approach can generally improve the learning efficiency and the performance of existing metric learning frameworks by regulating the distribution of the hidden space between the matching pairs. The experimental results show that this new approach significantly improves the overall performance of several popular cross-modal matching techniques (SCAN [13], VSRN [14], BFAN [15]) on the MS-COCO and Flickr30K benchmarks.
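As a rough illustration of the kind of metric learning objective the abstract refers to, the sketch below computes cosine similarities between image and text embeddings in a shared space and applies a bidirectional hinge ranking loss. This is an assumption about the general family of objectives used in cross-modal matching, not the ADDR regularizer itself, which the abstract does not specify. The function names, the margin value, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(image_embs, text_embs):
    """sim[i][j] is the similarity of image i and text j in the shared space."""
    return [[cosine(img, txt) for txt in text_embs] for img in image_embs]


def hinge_ranking_loss(sim, margin=0.2):
    """Bidirectional hinge loss summed over all non-matching pairs.

    Assumes sim[i][i] holds the matching (positive) image-text pairs.
    The margin of 0.2 is an illustrative choice, not a value from the paper.
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # Image i compared against a non-matching text j.
            loss += max(0.0, margin - pos + sim[i][j])
            # Text i compared against a non-matching image j.
            loss += max(0.0, margin - pos + sim[j][i])
    return loss


# Toy embeddings (hypothetical): two images and two captions in a 2-D space.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = hinge_ranking_loss(similarity_matrix(images, aligned_texts))
swapped_loss = hinge_ranking_loss(similarity_matrix(images, swapped_texts))
```

With perfectly aligned pairs the loss is zero, while swapping the captions produces a positive loss. An objective of this form compares features pair by pair; the abstract's proposal is to add a regularizer on the feature distributions on top of such an objective.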


Keywords

paradigm metric learning objective, discriminative data domains, image-text pairs, learning efficiency, hidden space, matching pairs, popular cross-modal matching techniques, deep metric learning, text modalities, natural language processing, efficient similarity metrics, visual information, textual information, local visual objects, sentence words, shared space, attention mechanisms, matching performance, similarity computation, simple comparisons, matching features, efficient learning objective, discriminative feature distributions, novel Adversarial Discriminative Domain Regularization learning framework
