Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Cited by 158
Abstract
How can unlabeled video augment visual learning? Existing methods perform "slow" feature analysis, encouraging the representations of temporally close frames to exhibit only small differences. While this standard approach captures the fact that high-level visual signals change slowly over time, it fails to capture how the visual content changes. We propose to generalize slow feature analysis to "steady" feature analysis. The key idea is to impose a prior that higher order derivatives in the learned feature space must be small. To this end, we train a convolutional neural network with a regularizer on tuples of sequential frames from unlabeled video. It encourages feature changes over time to be smooth, i.e., similar to the most recent changes. Using five diverse datasets, including unlabeled YouTube and KITTI videos, we demonstrate our method's impact on object, scene, and action recognition tasks. We further show that our features learned from unlabeled video can even surpass a standard heavily supervised pretraining approach.
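The core idea can be sketched numerically. A minimal illustration of the two penalties, assuming the abstract's formulation: the "slow" term penalizes the first temporal difference of features, while the proposed "steady" term penalizes the second difference over a tuple of three sequential frames. This helper function and the simple squared-distance form are hypothetical simplifications for clarity; the paper trains a convolutional network with a regularizer built on such sequential-frame tuples.

```python
import numpy as np

def temporal_coherence_losses(z1, z2, z3):
    """Illustrative slowness/steadiness penalties for the features
    z1, z2, z3 of three sequential frames (simplified sketch, not
    the paper's exact training objective)."""
    d1 = z2 - z1  # first-order feature change between frames 1 and 2
    d2 = z3 - z2  # first-order feature change between frames 2 and 3
    slow = float(np.sum(d1 ** 2))         # "slow": first derivative small
    steady = float(np.sum((d2 - d1) ** 2))  # "steady": second derivative small
    return slow, steady

# Features drifting at a constant rate incur zero steadiness penalty,
# even though the slowness penalty is nonzero.
slow, steady = temporal_coherence_losses(
    np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([2.0, 2.0])
)
print(slow, steady)  # 2.0 0.0
```

Note how the steady term tolerates smooth, constant-velocity feature trajectories that the slow term alone would penalize, which is exactly the higher-order prior the abstract describes.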
Keywords
slow-steady feature analysis, higher-order temporal coherence, visual learning augmentation, temporally close frame representation, high-level visual signals, feature space learning, convolutional neural network training, sequential frame tuples, unlabeled YouTube video, unlabeled KITTI video, object recognition task, scene recognition task, action recognition task