Identifying Machine-Printed and Handwritten Texts Using DropRegion and Deep Convolutional Network

2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1(2017)

Abstract
In this paper, we propose a deep convolutional neural network to identify machine-printed and handwritten texts. We also propose a novel data augmentation technique called DropRegion to compensate for the lack of available data and improve the generalization of the model. DropRegion increases data diversity by randomly dropping one of the stroke-containing regions in each raw input text-line image. Two parameters are introduced to make DropRegion adjustable for different data. On the task of distinguishing text in a mixture of five languages (English, Chinese, Japanese, Korean, and Russian), we achieve an accuracy of 99.07% after DropRegion is applied, which outperforms both a traditional method (97.91%) and our deep convolutional network baseline (98.75%).
Keywords
Deep convolutional neural network, machine-printed and handwritten identification, data augmentation
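
The abstract describes DropRegion only at a high level: one stroke-containing region of a text-line image is dropped at random, and two parameters make the behavior adjustable. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a grayscale image stored as a list of rows, regions defined as fixed-width column bands, and ink defined as pixels darker than a threshold. The parameter names `region_width` and `drop_prob` are hypothetical stand-ins; the abstract does not say what the paper's two parameters are.

```python
import random


def drop_region(image, region_width=8, drop_prob=0.5,
                background=255, ink_threshold=128, rng=None):
    """Return a copy of `image` with at most one stroke-containing band blanked.

    All parameters are illustrative assumptions, not taken from the paper:
    - region_width: width in pixels of each candidate column band
    - drop_prob: probability that any region is dropped at all
    """
    if rng is None:
        rng = random.Random()

    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]  # never modify the caller's image

    # Leave the image untouched for empty input or when the coin flip says so.
    if width == 0 or rng.random() >= drop_prob:
        return out

    # Candidate regions: column bands that contain at least one ink pixel.
    candidates = []
    for start in range(0, width, region_width):
        end = min(start + region_width, width)
        has_ink = any(row[x] < ink_threshold
                      for row in image for x in range(start, end))
        if has_ink:
            candidates.append((start, end))

    if not candidates:
        return out

    # Drop exactly one candidate by filling it with the background value.
    start, end = rng.choice(candidates)
    for row in out:
        for x in range(start, end):
            row[x] = background
    return out
```

Applied anew each time a training sample is drawn, a function like this would show the network a slightly different version of the same text line on each pass, which is the sense in which the abstract says DropRegion increases data diversity.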