A novel multi-modal neural network approach for dynamic and generic sports video summarization

Engineering Applications of Artificial Intelligence (2023)

Abstract
Video summarization is a video compression/compaction technique that creates a shorter yet informative version of the original video. It has offered solutions for many media, user, and engineering applications. Although sports video summarization has been an active research topic for some time, there is still no multi-modal, dynamic, generic, and domain-knowledge-based approach for summarizing Cricket videos. This paper presents a multi-modal approach to summarizing Cricket videos that captures domain knowledge from multi-modal (audio-visual) cues. A pipeline of two neural networks is proposed to dynamically segment and then dynamically summarize Cricket videos for a generic target audience. The first network uses Cricket bowling activity (a visual feature) to segment the video dynamically. The segments are then forwarded to the second network, which identifies key segments. This key segment detection module relies on audio analysis of the Cricket video stream to identify segments that are exciting, content-representative, and informative with respect to the Cricket domain. Experimental analysis on two newly proposed benchmark datasets, the DPCS (Delivery Play Cricket Sport) image dataset and the audio-based EXINP (Excited Interval Normal Play) Cricket dataset, shows promising results. The results indicate that the proposed multi-modal approach generates an exciting, content-representative, informative, generic, and dynamic summary that incorporates domain knowledge of the sport.
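The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the neural networks themselves are omitted, and the per-frame `bowling_flags`, the per-segment `excited` labels, and both function names are hypothetical stand-ins for the outputs of the visual and audio networks.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def segment_by_deliveries(bowling_flags: List[bool]) -> List[Segment]:
    """Split a video into segments that each begin where a delivery begins.

    bowling_flags[i] is an assumed per-frame output of the first (visual)
    network: True when frame i shows bowling activity.
    """
    # A delivery starts on a rising edge: bowling now, but not in the previous frame.
    starts = [
        i for i, flag in enumerate(bowling_flags)
        if flag and (i == 0 or not bowling_flags[i - 1])
    ]
    # Each segment runs until the next delivery starts (or the video ends).
    # Frames before the first delivery are not assigned to any segment.
    ends = starts[1:] + [len(bowling_flags)]
    return list(zip(starts, ends))


def select_key_segments(segments: List[Segment], excited: List[bool]) -> List[Segment]:
    """Keep the segments that the (assumed) audio network labels as excited."""
    return [seg for seg, is_excited in zip(segments, excited) if is_excited]


# Toy input: three deliveries starting at frames 1, 5, and 8.
flags = [False, True, True, False, False, True, False, False, True, True, False]
segments = segment_by_deliveries(flags)
# Hypothetical audio labels, one per segment (excited vs. normal play).
summary = select_key_segments(segments, [True, False, True])
```

Because segment boundaries follow detected deliveries rather than a fixed window, the number and length of segments vary with the match, which is one way to read "dynamic" in the abstract.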
Keywords
Video segmentation, Key segment, Cricket, Deep learning, Dynamic summary, Video summarization