Correlational Image Modeling for Self-Supervised Visual Pre-Training.

Computing Research Repository (CoRR), 2023

Nanyang Technological University S-Lab

Abstract

We introduce Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training. CIM performs a simple pretext task: we randomly crop image regions (exemplars) from an input image (context) and predict correlation maps between the exemplars and the context. Three key designs make correlational image modeling a nontrivial and meaningful self-supervisory task. First, to generate useful exemplar-context pairs, we crop image regions with various scales, shapes, rotations, and transformations. Second, we employ a bootstrap learning framework that involves an online encoder and a target encoder. During pre-training, the former takes the exemplars as input while the latter encodes the context. Third, we model the output correlation maps via a simple cross-attention block, within which the context serves as queries and the exemplars provide keys and values. We show that CIM performs on par with or better than the current state of the art on self-supervised and transfer benchmarks. Code is available at https://github.com/weivision/Correlational-Image-Modeling.git.
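The cross-attention design can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it assumes a single attention head with no learned query, key, or value projections, random token features in place of encoder outputs, and a hypothetical linear head (`correlation_logits`) that maps each attended context token to one correlation logit. The grid sizes and feature dimension are also arbitrary choices for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(context_tokens, exemplar_tokens):
    """Single-head cross-attention without learned projections (assumption).

    Queries come from the context; keys and values come from the exemplar.
    Returns one attended vector and one attention row per context token.
    """
    dim = len(context_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in context_tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in exemplar_tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append([
            sum(w[j] * exemplar_tokens[j][d] for j in range(len(exemplar_tokens)))
            for d in range(dim)
        ])
    return outputs, weights


def correlation_logits(attended, head_weights):
    """Hypothetical linear head: one scalar logit per context token."""
    return [sum(a * w for a, w in zip(tok, head_weights)) for tok in attended]


rng = random.Random(0)
dim, grid = 8, 4
# Stand-ins for encoder outputs: a 4x4 grid of context tokens, 2x2 exemplar tokens.
context = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(grid * grid)]
exemplar = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
head = [rng.gauss(0, 1) for _ in range(dim)]

attended, attn = cross_attention(context, exemplar)
logits = correlation_logits(attended, head)
# Reshape the per-token logits back onto the context grid.
corr_map = [logits[r * grid:(r + 1) * grid] for r in range(grid)]
```

Because the context supplies the queries, the output has one entry per context location, which is what lets the prediction be laid out as a map over the context image.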

Key words

Self-supervised or unsupervised representation learning

Chat Paper

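[Key points]: Introduces Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training. It solves a simple pretext task: predicting correlation maps between randomly cropped regions of the input image (exemplars) and the image itself (context).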
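The pretext task can be sketched as sampling an exemplar crop and building a target map over the context. This is a hedged illustration only: the source does not say how the ground-truth correlation map is defined, so the binary mask used here (1 inside the cropped region, 0 elsewhere) is an assumption, as are the function names and the `scale_range` and `ratio_range` defaults. Rotations and other transformations are omitted.

```python
import random


def sample_exemplar_box(height, width, rng,
                        scale_range=(0.1, 0.5), ratio_range=(0.5, 2.0)):
    """Sample a crop box (top, left, h, w) with random area and aspect ratio."""
    area = rng.uniform(*scale_range) * height * width
    ratio = rng.uniform(*ratio_range)  # width / height
    h = max(1, min(height, round((area / ratio) ** 0.5)))
    w = max(1, min(width, round((area * ratio) ** 0.5)))
    top = rng.randint(0, height - h)   # randint is inclusive on both ends
    left = rng.randint(0, width - w)
    return top, left, h, w


def crop(image, box):
    """Cut the exemplar out of a 2D list-of-lists image."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]


def target_map(height, width, box):
    """Assumed target: binary mask marking where the exemplar came from."""
    top, left, h, w = box
    return [[1 if top <= r < top + h and left <= c < left + w else 0
             for c in range(width)]
            for r in range(height)]


rng = random.Random(0)
H, W = 16, 16
context_image = [[rng.random() for _ in range(W)] for _ in range(H)]

box = sample_exemplar_box(H, W, rng)
exemplar_image = crop(context_image, box)
mask = target_map(H, W, box)
```

Under this assumption, the model's predicted correlation map would be compared against `mask`, so no human labels are needed.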

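[Method]: Three key designs make correlational image modeling a nontrivial and meaningful self-supervisory task. First, image regions are cropped at various scales, shapes, rotations, and transformations to generate useful exemplar-context pairs. Second, a bootstrap learning framework with an online encoder and a target encoder is used; during pre-training, the former takes the exemplars as input and the latter encodes the context. Third, the output correlation maps are modeled with a simple cross-attention block in which the context serves as queries and the exemplars provide keys and values.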
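The bootstrap framework pairs an online encoder with a target encoder, but the text here does not state how the target encoder is kept in sync. A common choice in bootstrap-style methods is to update the target as an exponential moving average (EMA) of the online weights; the sketch below assumes that rule. The function name `ema_update`, the dictionary-of-lists parameter layout, and the momentum value are all hypothetical.

```python
def ema_update(target_params, online_params, momentum=0.99):
    """Assumed update rule: target <- m * target + (1 - m) * online.

    Parameters are plain dicts mapping a name to a list of floats.
    Returns a new dict; the inputs are left unchanged.
    """
    return {
        name: [momentum * t + (1.0 - momentum) * o
               for t, o in zip(target_params[name], online_params[name])]
        for name in target_params
    }


# Toy weights standing in for the two encoders.
online = {"proj": [1.0, 2.0], "bias": [0.5]}
target = {"proj": [0.0, 0.0], "bias": [0.0]}

# One step after the online encoder has been updated by gradient descent.
target = ema_update(target, online, momentum=0.9)
```

With this rule the target encoder receives no gradients and changes slowly, which is the usual motivation for using it as a stable reference in bootstrap training.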

[Experiments]: CIM performs on par with or better than the current state of the art on self-supervised and transfer benchmarks.
