A Unified Video Text Detection Method with Network Flow

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)(2017)

引用 10|浏览19
暂无评分
摘要
Scene text detection in videos has many application needs but has drawn less attention than that in images. Existing methods for video text detection perform unsatisfactorily because of the insufficient utilization of spatial and temporal information. In this paper, we propose a novel video text detection method with network flow based tracking. The system first applies a newly proposed Fully Convolutional Neural Network (FCN) based scene text detection method to detect texts in individual frames and then track proposals in adjacent frames with a motion-based method. Next, the text association problem is formulated into a cost-flow network and text trajectories are derived from the network with a min-cost flow algorithm. At last, the trajectories are post-processed to improve the precision accuracy. The method can detect multi-oriented scene text in videos and incorporate spatial and temporal information efficiently. Experimental results show that the method improves the detection performance remarkably on benchmark datasets, e.g., by a 15.66% increase of ATA Average Tracking Accuracy) on ICDAR video scene text dataset.
更多
查看译文
关键词
video text detection,text tracking,scene text detection,network flow
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要