Adaptive Ladder Loss for Learning Coherent Visual-Semantic Embedding

IEEE Transactions on Multimedia (2023)

Abstract
For visual-semantic embedding, existing methods normally treat the relevance between queries and candidates in a bipolar way: relevant or irrelevant, and all "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their varying proximity to the query. This practice disregards the relative discriminative information and could lead to suboptimal ranking in the retrieval results and a poorer user experience, especially in the long-tail query scenario where a matching candidate may not necessarily exist. In this paper, we introduce a continuous variable to model the relevance degree between queries and multiple candidates, and propose to learn a coherent embedding space, where candidates with higher relevance degrees are mapped closer to the query than those with lower relevance degrees. In particular, a new ladder loss is proposed by extending the triplet-loss inequality to a more general inequality chain, which implements variable push-away margins according to the respective relevance degrees. To adapt to varying mini-batch statistics and improve the efficiency of the ladder loss, we also propose a Silhouette-score-based method to adaptively decide the ladder levels and hence the underlying inequality chain. In addition, a proper Coherent Score metric is proposed to better measure the ranking results, including those "irrelevant" candidates. Extensive experiments on multiple datasets validate the efficacy of our proposed method, which achieves significant improvement over existing state-of-the-art methods.
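To make the inequality-chain idea concrete, below is a minimal sketch of a ladder-style loss, assuming cosine similarity as the scoring function and a precomputed integer relevance level per candidate. The function name `ladder_loss`, the per-boundary `margins`, and the toy data are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a ladder-style loss (illustrative, not the paper's code).
# Assumes: cosine similarity scoring, integer relevance levels per candidate
# (0 = most relevant, larger = less relevant), one margin per level boundary.
import torch
import torch.nn.functional as F


def ladder_loss(query, candidates, levels, margins):
    """Hinge loss enforcing an inequality chain over relevance levels.

    query:      (d,) embedding of the query.
    candidates: (n, d) embeddings of the candidates.
    levels:     (n,) integer relevance level per candidate.
    margins:    one push-away margin per level boundary l; candidates at
                levels <= l must score higher than those at levels > l
                by at least margins[l].
    """
    sims = F.cosine_similarity(query.unsqueeze(0), candidates, dim=1)  # (n,)
    loss = query.new_zeros(())
    for l, margin in enumerate(margins):
        upper = sims[levels <= l]   # should stay closer to the query
        lower = sims[levels > l]    # should be pushed further away
        if upper.numel() == 0 or lower.numel() == 0:
            continue
        # Every (upper, lower) pair must satisfy s_upper >= s_lower + margin.
        gap = margin - upper.unsqueeze(1) + lower.unsqueeze(0)
        loss = loss + gap.clamp(min=0).mean()
    return loss


# Toy usage: one query, five candidates split into three relevance levels.
torch.manual_seed(0)
q = torch.randn(128)
c = torch.randn(5, 128)
lv = torch.tensor([0, 0, 1, 1, 2])  # 2 relevant, 2 partially relevant, 1 irrelevant
print(ladder_loss(q, c, lv, margins=[0.2, 0.1]))
```

With a single level boundary the chain collapses to an ordinary triplet-style hinge; in this sketch, the Silhouette-score-based adaptation described above would correspond to choosing the number of levels, and hence the length of `margins`, per mini-batch.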
Keywords
Coherent visual-semantic embedding, adaptive ladder loss, hard-contrastive sampling, coherent score