ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning

IEEE Transactions on Multimedia (2023)

Abstract
The central idea of contrastive learning is to discriminate between different instances while forcing different views of the same instance to share the same representation. To avoid trivial solutions, augmentation plays an important role in generating different views, and random cropping in particular has been shown to help the model learn a generalized and robust representation. The commonly used random crop operation, however, keeps the distribution of the difference between the two views unchanged throughout training. In this work, we show that adaptively controlling the disparity between two augmented views during training enhances the quality of the learned representations. Specifically, we present a parametric cubic cropping operation, ParamCrop, for video contrastive learning, which automatically crops a 3D cubic region using differentiable 3D affine transformations. ParamCrop is trained simultaneously with the video backbone using an adversarial objective, so that it learns to increase the contrastive loss and thus gradually reduces the content shared between the two cropped views. Experiments show that this adaptive and gradual increase in disparity is beneficial for learning a strong and generalized representation for downstream tasks, and that the approach is effective across multiple contrastive learning frameworks and video backbones.
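As a rough illustration of the geometry described in the abstract, the sketch below expresses a 3D crop as a 3x4 affine matrix over normalized (t, y, x) coordinates and measures how much content two crops share. It is not the authors' implementation: the `CropParams` fields, the restriction to scale and translation, and the `shared_fraction` helper are assumptions made for this example, and the differentiable sampling and adversarial training steps are omitted.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class CropParams:
    """Hypothetical crop parameters in normalized [-1, 1] coords, axis order (t, y, x)."""
    scale: tuple   # half-extent of the crop along each axis, each in (0, 1]
    center: tuple  # crop center along each axis, each in [-1, 1]


def affine_matrix(p):
    """3x4 affine matrix mapping output-crop coordinates to input-video coordinates.

    Assumption: only scale and translation are modeled here.
    """
    return [
        [p.scale[0], 0.0, 0.0, p.center[0]],
        [0.0, p.scale[1], 0.0, p.center[1]],
        [0.0, 0.0, p.scale[2], p.center[2]],
    ]


def crop_corners(p):
    """Map the 8 corners of the output cube [-1, 1]^3 into the input video."""
    m = affine_matrix(p)
    corners = []
    for c in product((-1.0, 1.0), repeat=3):
        h = (*c, 1.0)  # homogeneous coordinate
        corners.append(tuple(sum(m[i][j] * h[j] for j in range(4)) for i in range(3)))
    return corners


def shared_fraction(a, b):
    """Fraction of view a's volume that also lies inside view b (axis-aligned crops)."""
    frac = 1.0
    for s_a, c_a, s_b, c_b in zip(a.scale, a.center, b.scale, b.center):
        lo = max(c_a - s_a, c_b - s_b)
        hi = min(c_a + s_a, c_b + s_b)
        frac *= max(0.0, hi - lo) / (2.0 * s_a)
    return frac
```

Under these assumptions, two identical crops share all of their content, and shifting one crop along the temporal axis lowers the shared fraction, which is the kind of disparity the adversarial objective is described as increasing over training.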
Keywords
Parametric cropping, contrastive learning, video representation learning