Are These the Same Apple? Comparing Images Based on Object Intrinsics

NeurIPS (2023)

Abstract
The human visual system can effortlessly recognize an object under different extrinsic factors such as lighting, object poses, and background, yet current computer vision systems often struggle with these variations. An important step to understanding and improving artificial vision systems is to measure image similarity purely based on intrinsic object properties that define object identity. This problem has been studied in the computer vision literature as re-identification, though mostly restricted to specific object categories such as people and cars. We propose to extend it to general object categories, exploring an image similarity metric based on object intrinsics. To benchmark such measurements, we collect the Common paired objects Under differenT Extrinsics (CUTE) dataset of 18,000 images of 180 objects under different extrinsic factors such as lighting, poses, and imaging conditions. While existing methods such as LPIPS and CLIP scores do not measure object intrinsics well, we find that combining deep features learned from contrastive self-supervised learning with foreground filtering is a simple yet effective approach to approximating the similarity. We conduct an extensive survey of pre-trained features and foreground extraction methods to arrive at a strong baseline that best measures intrinsic object-centric image similarity among current methods. Finally, we demonstrate that our approach can aid in downstream applications such as acting as an analog for human subjects and improving generalizable re-identification. Please see our project website at https://s-tian.github.io/projects/cute/ for visualizations of the data and demos of our metric.
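The idea of combining deep features with foreground filtering can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-patch feature vectors have already been produced by some pre-trained encoder and that a boolean foreground mask per patch is available from some segmentation method. The function names (`masked_mean`, `cosine_similarity`, `intrinsic_similarity`), the choice of mean pooling over foreground patches, and the use of cosine similarity are all illustrative assumptions that the abstract does not specify.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def masked_mean(patch_features: Sequence[Vector], foreground: Sequence[bool]) -> List[float]:
    """Average only the patch features flagged as foreground (assumed pooling choice)."""
    kept = [feat for feat, keep in zip(patch_features, foreground) if keep]
    if not kept:
        raise ValueError("foreground mask selects no patches")
    dim = len(kept[0])
    return [sum(feat[i] for feat in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


def intrinsic_similarity(
    feats_a: Sequence[Vector], mask_a: Sequence[bool],
    feats_b: Sequence[Vector], mask_b: Sequence[bool],
) -> float:
    """Hypothetical object-centric score: pool foreground patches, then compare."""
    return cosine_similarity(masked_mean(feats_a, mask_a), masked_mean(feats_b, mask_b))


if __name__ == "__main__":
    # Toy data: patch 0 is the object, patch 1 is background that differs between images.
    feats_a = [[1.0, 0.0], [0.0, 5.0]]
    feats_b = [[2.0, 0.0], [7.0, 7.0]]
    object_only = [True, False]
    everything = [True, True]

    print("with foreground filtering:", intrinsic_similarity(feats_a, object_only, feats_b, object_only))
    print("without filtering:        ", intrinsic_similarity(feats_a, everything, feats_b, everything))
```

In this toy case the filtered score is higher than the unfiltered one because the background patches, which differ between the two images, no longer contribute to the pooled feature.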
Keywords
same apple, images