SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition
arXiv: Computer Vision and Pattern Recognition, pp. 6232-6242, 2019.
While many action recognition datasets consist of collections of brief, trimmed videos each containing a relevant action, videos in the real world (e.g., on YouTube) exhibit very different properties: they are often several minutes long, and brief relevant clips are often interleaved with segments of extended duration containing little ...