Unsupervised Multi-Modal Representation Learning for High Quality Retrieval of Similar Products at E-commerce Scale

Kushal Kumar, Tarik Arici, Tal Neiman, Jinyu Yang, Shioulin Sam, Yi Xu, Hakan Ferhatosmanoglu, Ismail Tutar

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023 (2023)

Abstract
Identifying similar products in e-commerce is useful for discovering relationships between products, making recommendations, and increasing diversity in search results. Product representation learning is the first step toward defining a generalized product similarity metric for search. The second step is to extend similarity search to a large scale (e.g., e-commerce catalog scale) without sacrificing quality. In this work, we present a solution that interweaves both steps: we learn representations suited to high-quality retrieval using contrastive learning (CL), and we retrieve similar items from a large search space using approximate nearest neighbor search (ANNS), which trades off quality for speed. We propose a CL training strategy for learning uni-modal encoders suited to multi-modal similarity search for e-commerce. We study ANNS retrieval by generating Pareto Frontiers (PFs) without requiring labels. Our CL training strategy doubles the retrieval@1 metric across categories (e.g., from 36% to 88% in category C). We also demonstrate that ANNS engine optimization using PFs helps select configurations appropriately (e.g., we achieve 6.8x search speed with just a 2% drop from the maximum retrieval accuracy on medium-size datasets).
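The idea of selecting an ANNS configuration from a Pareto Frontier can be illustrated with a minimal sketch. The abstract does not specify how the frontier is computed or which label-free accuracy proxy is used, so everything below is an assumption for illustration: the `Config` type, the helper names `pareto_frontier` and `fastest_within`, and the speedup and accuracy numbers are hypothetical and are not results from the paper.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str        # hypothetical label for an ANNS engine configuration
    speedup: float   # search speed relative to a baseline (higher is better)
    accuracy: float  # retrieval accuracy in [0, 1] (higher is better)


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Keep configurations that no other configuration beats on both axes."""
    frontier = []
    best_accuracy = float("-inf")
    # Sweep from fastest to slowest; a slower configuration survives only
    # if it is strictly more accurate than every faster one seen so far.
    for c in sorted(configs, key=lambda c: (-c.speedup, -c.accuracy)):
        if c.accuracy > best_accuracy:
            frontier.append(c)
            best_accuracy = c.accuracy
    return frontier


def fastest_within(frontier: List[Config], max_drop: float) -> Config:
    """Pick the fastest configuration within max_drop of the best accuracy."""
    top = max(c.accuracy for c in frontier)
    eligible = [c for c in frontier if c.accuracy >= top - max_drop]
    return max(eligible, key=lambda c: c.speedup)


# Illustrative, made-up measurements (not taken from the paper).
measured = [
    Config("a", speedup=1.0, accuracy=0.900),
    Config("b", speedup=3.0, accuracy=0.895),
    Config("c", speedup=6.0, accuracy=0.885),
    Config("d", speedup=5.0, accuracy=0.870),   # dominated by "c"
    Config("e", speedup=12.0, accuracy=0.800),
]

frontier = pareto_frontier(measured)
chosen = fastest_within(frontier, max_drop=0.02)
print([c.name for c in frontier], chosen.name)
```

In this sketch, configuration "d" is dropped because "c" is both faster and more accurate, and the selection step then picks the fastest remaining configuration whose accuracy stays within the allowed drop from the maximum.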
Keywords
Deep Metric Learning, Unsupervised Learning, Approximate Nearest Neighbor Search, Amazon OpenSearch, Clustering