Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery
arXiv (2024)
Abstract
Unsupervised landmark discovery (ULD) for an object category is a
challenging computer vision problem. In pursuit of developing a robust ULD
framework, we explore the potential of a recent paradigm of self-supervised
learning algorithms, known as diffusion models. Some recent works have shown
that these models implicitly contain important correspondence cues. Towards
harnessing the potential of diffusion models for the ULD task, we make the
following core contributions. First, we propose a ZeroShot ULD baseline based
on simple clustering of random pixel locations with nearest neighbour matching.
It delivers better results than existing ULD methods. Second, motivated by the
ZeroShot performance, we develop a ULD algorithm based on diffusion features
using self-training and clustering which also outperforms prior methods by
notable margins. Third, we introduce a new proxy task based on generating
latent pose codes and also propose a two-stage clustering mechanism to
facilitate effective pseudo-labeling, resulting in a significant performance
improvement. Overall, our approach consistently outperforms state-of-the-art
methods on four challenging benchmarks (AFLW, MAFL, CatHeads and LS3D) by
significant margins.
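
The ZeroShot baseline described above (clustering of random pixel locations, followed by nearest neighbour matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `zero_shot_landmarks`, the parameter `samples_per_image`, the use of plain k-means with Euclidean distance, and the random arrays standing in for diffusion features are all assumptions made for illustration only.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on (n, d) points; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct points (fancy indexing copies).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every centroid: (n, k).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def zero_shot_landmarks(feature_maps, k, samples_per_image=50, seed=0):
    """Hypothetical helper: feature_maps is (N, H, W, D); returns (N, k, 2)
    integer (row, col) landmark locations, one per cluster and image."""
    rng = np.random.default_rng(seed)
    n, h, w, d = feature_maps.shape
    flat = feature_maps.reshape(n, h * w, d)

    # 1. Sample random pixel locations in every image and pool their features.
    idx = rng.integers(0, h * w, size=(n, samples_per_image))
    sampled = flat[np.arange(n)[:, None], idx].reshape(-1, d)

    # 2. Cluster the pooled features; each centroid acts as one landmark.
    centroids = kmeans(sampled, k, seed=seed)

    # 3. Nearest neighbour matching: in each image, pick the pixel whose
    #    feature is closest to each centroid. The (n, H*W, k, d) intermediate
    #    is fine for a toy example but too large for real feature maps.
    d2 = ((flat[:, :, None, :] - centroids[None, None, :, :]) ** 2).sum(-1)
    best = d2.argmin(axis=1)  # (n, k) flat pixel indices
    return np.stack(np.unravel_index(best, (h, w)), axis=-1)


# Toy usage: random arrays stand in for per-pixel diffusion features.
features = np.random.default_rng(1).normal(size=(4, 8, 8, 16))
landmarks = zero_shot_landmarks(features, k=5)
```

Here `landmarks[i, j]` holds the (row, col) position assigned to landmark `j` in image `i`. In practice the features would come from a pretrained diffusion model rather than random noise, and the distance measure and sampling strategy may differ from this sketch.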