Large-scale weakly-supervised pre-training for video action recognition
arXiv: Computer Vision and Pattern Recognition, 2019.
Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders the progress towards advanced video architectures. This paper presents an in-depth study of using large volumes of web videos for pre-training video models for the task of action recognition....More
PPT (Upload PPT)