Generalized framework for summarization of fixed-camera lecture videos by detecting and binarizing handwritten content

International Journal on Document Analysis and Recognition (IJDAR) (2019)

Abstract
We propose a framework to extract and binarize handwritten content in lecture videos. The extracted content could be used to index video collections, powering content-based search and navigation within lecture videos to help students and educators across the world. A deep learning pipeline detects handwritten text, formulae, and sketches, and then binarizes the extracted content. We exploit the spatio-temporal structure of our binarized detections to compute associativity information of content across all video frames. This information is later used to segment the video. Experiments compare key components of our framework with existing methods, both in isolation and in terms of their impact on overall performance. We evaluate our framework on the publicly available AccessMath lecture video dataset, obtaining an f-measure of 94.32% for binary connected components. Code for the framework (including trained weights) and summarization will be released.
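For readers unfamiliar with the reported metric, the following is a minimal sketch of how an f-measure is generally computed from match counts, here imagined as counts of matched and unmatched connected components. The function name `f_measure` and the counts are hypothetical illustrations; they are not taken from the paper, and the paper's own matching criteria for connected components are not described in this abstract.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when it is undefined."""
    predicted = true_positives + false_positives   # components the method produced
    actual = true_positives + false_negatives      # components in the ground truth
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
score = f_measure(true_positives=90, false_positives=10, false_negatives=10)
print(f"f-measure: {score:.4f}")
```

Because the f-measure is a harmonic mean, a high score requires both precision and recall to be high; a method that detects many spurious components or misses many true ones is penalized either way.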
Keywords
Lecture video summarization, Handwritten text detection, Binarization, Deep learning