MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
CoRR (2024)
Abstract
We introduce MIM (Masked Image Modeling)-Refiner, a contrastive learning
boost for pre-trained MIM models. The motivation behind MIM-Refiner is rooted
in the insight that optimal representations within MIM models generally reside
in intermediate layers. Accordingly, MIM-Refiner leverages multiple contrastive
heads that are connected to diverse intermediate layers. In each head, a
modified nearest neighbor objective helps to construct respective semantic
clusters.
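The mechanism can be illustrated with a compact sketch. Below is a minimal PyTorch illustration of attaching contrastive heads to several intermediate layers and training them with a nearest-neighbor-swapped InfoNCE objective (in the spirit of NNCLR). The toy encoder, the chosen head depths, the random support queues, and the `nn_swap`/`info_nce` helpers are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a pre-trained MIM ViT; returns the activations of every block."""
    def __init__(self, dim=64, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList([nn.Linear(dim, dim) for _ in range(depth)])

    def forward(self, x):
        feats = []
        for blk in self.blocks:
            x = torch.relu(blk(x))
            feats.append(x)
        return feats  # one feature tensor per block

def nn_swap(z, queue):
    """Replace each embedding by its nearest neighbor from a support queue
    (NNCLR-style), so positives are drawn from the same semantic cluster."""
    sim = F.normalize(z, dim=-1) @ F.normalize(queue, dim=-1).T
    return queue[sim.argmax(dim=-1)]

def info_nce(q, k, temperature=0.1):
    """Standard InfoNCE loss between two aligned batches of embeddings."""
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
    logits = q @ k.T / temperature
    labels = torch.arange(q.size(0))
    return F.cross_entropy(logits, labels)

# Attach one contrastive head to each of several intermediate blocks
# (head placement and queue size are hypothetical here).
encoder = ToyEncoder()
head_layers = [3, 5, 7]
heads = nn.ModuleList([nn.Linear(64, 32) for _ in head_layers])
queues = [torch.randn(256, 32) for _ in head_layers]  # per-head support queues

x1, x2 = torch.randn(16, 64), torch.randn(16, 64)     # two augmented views
f1, f2 = encoder(x1), encoder(x2)

loss = 0.0
for head, layer, queue in zip(heads, head_layers, queues):
    z1, z2 = head(f1[layer]), head(f2[layer])
    # Swap one view's embeddings for their queue neighbors, then contrast.
    loss = loss + info_nce(nn_swap(z1.detach(), queue), z2)
loss.backward()
```

In this sketch the multi-head losses are simply summed, so gradients from every intermediate depth flow back into the pre-trained encoder during the short refinement phase.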
The refinement process is short but effective. Within a few epochs, we refine
the features of MIM models from subpar to state-of-the-art, off-the-shelf
features. Refining a ViT-H, pre-trained with data2vec 2.0 on ImageNet-1K,
achieves new state-of-the-art results in linear probing (84.7%) and low-shot
classification among models that are pre-trained on ImageNet-1K. In ImageNet-1K
1-shot classification, MIM-Refiner sets a new state-of-the-art of 64.2%,
outperforming larger models that were trained on up to 2000x more data, such as
DINOv2-g, OpenCLIP-G and MAWS-6.5B. Project page:
https://ml-jku.github.io/MIM-Refiner