Creating personalized video summaries via semantic event detection

Journal of Ambient Intelligence and Humanized Computing (2023)

Citations: 7 | Views: 52
Abstract
Video summarization has great potential in many application areas because it enables fast browsing and efficient video indexing. Viewers prefer to browse a video summary containing the content they enjoy, since watching an entire video may be time-consuming. We believe it is necessary to create an automated tool capable of generating personalized video summaries. In this paper, we propose a new event detection-based personalized video summarization framework and deploy it to create film and soccer video summaries. To obtain effective event detection performance, we introduce two transfer learning methods. The first event detection method is based on the combination of convolutional neural networks and support vector machines (CNNs-SVM). The second method uses a fine-tuned summarization network (SumNet) that fuses fine-tuned object and scene networks. In this study, the training data consists of two datasets: (1) a 21 K set of web images of back hugging, hand shaking, and standing talking, used to detect film events, and (2) a 30 K set of web soccer match images of goals, fouls, and yellow cards, used to detect soccer events. Given an original video, we first segment it into shots and then use the trained model for event detection. Finally, based on the specification of user preferences, we generate a personalized event-based summary. We test our framework with several film videos and soccer videos. Experimental results demonstrate that the proposed fine-tuned SumNet achieves the best performance of 96.88%.
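The final step described in the abstract — selecting shots whose detected event matches the viewer's stated preferences — can be sketched as a simple filter. This is an illustrative sketch only, not the authors' implementation: the shot boundaries, event labels, and the `personalized_summary` helper below are all hypothetical placeholders standing in for the output of the paper's shot segmentation and event detection stages.

```python
def personalized_summary(shots, preferences):
    """Keep shots whose detected event is in the user's preferences.

    shots: list of (start_sec, end_sec, event_label) tuples, one per
           detected shot (assumed to come from an upstream event detector)
    preferences: set of event labels the viewer wants in the summary
    """
    return [(start, end) for (start, end, label) in shots
            if label in preferences]


# Made-up shot data for a soccer video, for illustration only
shots = [
    (0.0, 12.5, "foul"),
    (12.5, 40.0, "none"),
    (40.0, 55.0, "goal"),
    (55.0, 70.0, "yellow_card"),
]

# A viewer who only wants goals and yellow cards
summary = personalized_summary(shots, {"goal", "yellow_card"})
# summary → [(40.0, 55.0), (55.0, 70.0)]
```

The selected time intervals would then be concatenated (e.g., with a video editing tool) to produce the personalized summary clip.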
Keywords
Personalized video summarization, Shot boundary detection, Event detection, CNNs-SVM, Fine-tuned SumNet