Learning realistic human actions from movies
CVPR (2008)
Abstract
The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems one of which is the lack of realistic and annotated video datasets. Our first contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation of human actions in videos. We evaluate alternative methods for action retrieval from scripts and show benefits of a text-based classifier. Using the retrieved action samples for visual learning, we next turn to the problem of action classification in video. We present a new method for video classification that builds upon and extends several recent ideas including local space-time features, space-time pyramids and multi-channel non-linear SVMs. The method is shown to improve state-of-the-art results on the standard KTH action dataset by achieving 91.8% accuracy. Given the inherent problem of noisy labels in automatic annotation, we particularly investigate and show high tolerance of our method to annotation errors in the training set. We finally apply the method to learning and classifying challenging action classes in movies and show promising results.
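The abstract names multi-channel non-linear SVMs but does not spell out the kernel. As a minimal sketch, assuming a chi-square distance per channel and an exponential combination across channels (a common choice for histogram features, not confirmed by the abstract itself), the kernel value between two videos could be computed as follows. The channel names and the scale values are hypothetical and chosen only for illustration.

```python
import numpy as np


def chi2_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two non-negative histograms."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    # eps guards against division by zero when a bin is empty in both histograms.
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))


def multichannel_kernel(channels_a, channels_b, scales):
    """Combine per-channel distances as exp(-sum_c D_c / A_c).

    channels_a, channels_b: dict mapping channel name -> histogram.
    scales: dict mapping channel name -> positive normalizer A_c
            (assumed here to be supplied by the caller, for example
            the mean distance of that channel over the training set).
    """
    total = 0.0
    for name, scale in scales.items():
        total += chi2_distance(channels_a[name], channels_b[name]) / scale
    return float(np.exp(-total))


# Hypothetical channels: one histogram per (descriptor, space-time grid) pair.
video_a = {"desc1_grid1": [1.0, 0.0], "desc2_grid1": [0.5, 0.5]}
video_b = {"desc1_grid1": [0.0, 1.0], "desc2_grid1": [0.5, 0.5]}
scales = {"desc1_grid1": 1.0, "desc2_grid1": 1.0}

k_ab = multichannel_kernel(video_a, video_b, scales)
k_aa = multichannel_kernel(video_a, video_a, scales)
```

A matrix of such kernel values over all training pairs could then be passed to an SVM implementation that accepts precomputed kernels. Each space-time pyramid cell would contribute its own histogram, so adding a finer grid only adds entries to the channel dictionaries.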
Keywords
cinematography, image classification, image retrieval, learning (artificial intelligence), support vector machines, video signal processing, automatic video annotation, human action retrieval, local space-time feature, movie script, multichannel nonlinear SVM, space-time pyramid, text-based classifier, video action classification, video realistic human action recognition, visual learning