TEVI: Text Extraction for Video Indexing

MSRA (2007)

Abstract
Efficient indexing and retrieval of digital video is an important aspect of video databases. One powerful index for retrieval is the text appearing in the videos, since it enables content-based browsing. In this paper, we describe a system for detecting and extracting text appearing in video frames. A supervised learning method based on color and edge information is used to detect text regions. An unsupervised clustering step is then applied, using color information, for text segmentation and binarization. Experimental results demonstrate that the proposed approach is robust to font size, font color, background complexity and language.

Keywords: video OCR, multi-frame integration, text detection, localization, segmentation, neural networks, fuzzy C-means.

I. Introduction

Video is a medium of paramount importance that is still usually handled as a basic (non-decomposable) object in multimedia documents. Its contents are rarely made explicit, and it is often very difficult to classify a video or extract any knowledge from it. Many applications, such as content-based indexing and retrieval, require access to the internal structure of the video and the ability to expose or manipulate data of finer granularity, such as text or visual objects. Classification and annotation are usually carried out manually according to a list of keywords chosen by the user. This process is tedious, so automating indexing is of great interest. Extracting relevant information such as text can provide additional data about the semantic content of these videos.

Nevertheless, text detection and recognition face several problems. Although the text is often well contrasted with respect to its environment, it may be superimposed on a heterogeneous and complex background. Moreover, the text itself can be multicolored and heterogeneous. These characteristics make its extraction difficult.

In this paper, we propose an approach to automatically localize, segment and binarize text appearing in video frames. We first apply a new multiple-frame integration (MFI) method to minimize the variation of the background across video frames. Second, a supervised learning method based on color and edge information is used to efficiently detect text regions. Third, an unsupervised clustering for text segmentation and binarization is applied using color information.

II. Text Extraction from Video
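The MFI step is not detailed in this excerpt. A common way to realize multi-frame integration for overlaid video text, and a plausible reading of the step outlined above, is a per-pixel minimum (or maximum) over a short run of frames that show the same caption: the static text survives while the moving background is pushed toward a more uniform value. The sketch below is a minimal illustration under that assumption; the `frames` input and the bright-text/minimum convention are assumptions, not the authors' exact formulation.

```python
import numpy as np

def integrate_frames(frames, text_is_bright=True):
    """Multi-frame integration over consecutive frames showing the same caption.

    frames: iterable of grayscale images (H x W uint8 arrays).
    For bright text on a changing darker background, the per-pixel minimum
    keeps the static bright strokes while darkening the varying background;
    for dark text, the per-pixel maximum plays the symmetric role.
    """
    stacked = np.stack([np.asarray(f, dtype=np.uint8) for f in frames], axis=0)  # (T, H, W)
    if text_is_bright:
        return stacked.min(axis=0)   # static bright strokes survive the minimum
    return stacked.max(axis=0)       # static dark strokes survive the maximum
```

In practice the run of frames would be the frames over which a caption is detected as stable, so that the integration does not blend different captions together.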
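The detection stage is described here only at a high level ("a supervised learning method based on color and edge information", with neural networks among the keywords). As a hedged illustration of that idea, one can compute simple color and edge statistics for each candidate block of the integrated frame and classify blocks as text or non-text with a small neural network. The features and the `MLPClassifier` settings below are assumptions made for the sketch, not the authors' configuration, and the training data is assumed to come from manually annotated frames.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def block_features(rgb_block):
    """Illustrative color + edge features for one candidate block."""
    gray = rgb_block.astype(np.float64).mean(axis=2)
    gx = np.abs(np.diff(gray, axis=1)).mean()          # horizontal edge density
    gy = np.abs(np.diff(gray, axis=0)).mean()          # vertical edge density
    color_std = rgb_block.reshape(-1, 3).std(axis=0)   # per-channel color spread
    return np.array([gx, gy, *color_std, gray.std()])

def train_detector(X_train, y_train):
    """Small neural network over block features (text vs. non-text)."""
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    return clf
```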
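For the final segmentation and binarization stage, the keywords point to fuzzy C-means clustering on color. As a rough sketch (the RGB feature choice, the use of two clusters, and the rule for picking the text cluster are assumptions rather than the paper's exact settings), the pixels of a detected text box can be clustered in color space and the cluster matching the stroke color kept as foreground:

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means on feature vectors X of shape (N, D)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)                  # memberships sum to 1 per pixel
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)           # (c, D)
        dist = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-9
        new_U = 1.0 / (dist ** (2.0 / (m - 1.0)))
        new_U /= new_U.sum(axis=0, keepdims=True)
        if np.abs(new_U - U).max() < tol:
            U = new_U
            break
        U = new_U
    return centers, U

def binarize_text_box(rgb_box, text_is_bright=True):
    """Cluster the pixels of a detected text box into two color clusters
    and keep the brighter (or darker) cluster as the binary text mask."""
    h, w, _ = rgb_box.shape
    X = rgb_box.reshape(-1, 3).astype(np.float64)
    centers, U = fuzzy_cmeans(X, c=2)
    labels = U.argmax(axis=0)                          # hard labels from memberships
    brightness = centers.mean(axis=1)                  # mean intensity per cluster
    text_cluster = brightness.argmax() if text_is_bright else brightness.argmin()
    return (labels == text_cluster).reshape(h, w).astype(np.uint8) * 255
```

Because the memberships are soft, borderline pixels on anti-aliased character edges are not forced into either cluster until the final hard assignment, which is one reason fuzzy C-means is a natural fit for binarizing low-resolution video text.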