Cross-Domain Image-Object Retrieval Based on Weighted Optimal Transport

IEEE Transactions on Multimedia (2023)

Abstract
Given a 2D image query and a pool of 3D objects, the goal of image-object retrieval is to rank the 3D objects according to how well their content matches the query. Previous methods usually project 2D images and 3D objects into a joint embedding space and minimize a distance metric between them to perform retrieval. Because 2D images and 3D objects come from two different domains with a large discrepancy, the gap between their feature distributions remains significant even after both are mapped to a shared space, which leads to domain misalignment. In this work, we propose a novel image-object retrieval method that leverages optimal transport theory. Specifically, to bridge the dimensionality gap between 2D images and 3D objects, we first represent each 3D object as a sequence of its 2D projections. We then design a Cross-Domain View Attention (CDVA) module that automatically computes the optimal combination of 3D object projections for a given 2D query image. Next, we use a Weighted Optimal Transport (WOT)-based distance to measure the discrepancy between 2D images and 3D objects, and we reduce this discrepancy to achieve instance-level alignment. Under this scheme, transported 2D images and 3D objects with the same label are constrained to follow similar distributions. Finally, we design an explicit Category Centroid Alignment (CCA) module that performs class-level alignment to further improve retrieval performance. Extensive experiments show that our method achieves competitive performance on the MI3DOR and MI3DOR-2 benchmarks.
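The abstract does not give the formulation of CDVA or the WOT distance, so the following is only a minimal sketch of the general idea under stated assumptions. It pools an object's view features with softmax attention against the query, then computes an entropy-regularized optimal transport cost (Sinkhorn iterations) between image features and pooled object features with non-uniform marginal weights. The function names, the scaled dot-product attention, the squared Euclidean cost, and the regularization strength are all hypothetical choices made for illustration, not the paper's method.

```python
import numpy as np


def attend_views(query, views):
    """Pool view features (V, d) into one (d,) vector.

    Assumption: views are weighted by a softmax over scaled dot-product
    similarity to the query (d,). The paper's CDVA module may differ.
    """
    scores = views @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()       # softmax, sums to 1
    return weights @ views, weights


def sinkhorn_plan(cost, a, b, reg=0.5, n_iters=200):
    """Entropy-regularized transport plan with marginals a (n,) and b (m,)."""
    K = np.exp(-cost / reg)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                   # match column marginals
        u = a / (K @ v)                     # match row marginals
    return u[:, None] * K * v[None, :]


def weighted_ot_distance(img_feats, obj_feats, a, b, reg=0.5):
    """Transport cost between image features (n, d) and object features (m, d).

    Assumption: squared Euclidean cost, rescaled to [0, 1]; the inputs must
    not all be identical, otherwise the rescaling divides by zero.
    """
    diff = img_feats[:, None, :] - obj_feats[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    cost = cost / cost.max()
    plan = sinkhorn_plan(cost, a, b, reg)
    return float((plan * cost).sum()), plan


# Toy data: 4 image features, 3 objects with 5 rendered views each, d = 8.
rng = np.random.default_rng(0)
images = rng.normal(size=(4, 8))
objects = rng.normal(size=(3, 5, 8))

# Pool every object's views with respect to the first image as the query.
pooled = np.stack([attend_views(images[0], views)[0] for views in objects])

# Hypothetical marginal weights: uniform over images, non-uniform over objects.
a = np.full(4, 0.25)
b = np.array([0.5, 0.3, 0.2])
distance, plan = weighted_ot_distance(images, pooled, a, b)
print(distance, plan.shape)
```

In a training setting, a distance of this kind would typically serve as a loss term that is minimized to bring the two feature distributions closer; how the marginal weights are chosen is the part that the "weighted" variant would specify, and the abstract does not say how.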
Keywords
3D Object Retrieval, Cross-Domain Feature Learning, Multi-View Learning