Offline Handwriting Recognition on Devanagari Using a New Benchmark Dataset

2018 13th IAPR International Workshop on Document Analysis Systems (DAS) (2018)

Abstract
Handwriting recognition (HWR) in Indic scripts such as Devanagari is very challenging due to the subtleties of the scripts, variations in rendering, and the cursive nature of the handwriting. The lack of public handwriting datasets in Indic scripts has long stymied the development of offline handwritten word recognizers and made comparison across different methods tedious. In this paper, we release a new handwritten word dataset for Devanagari, IIIT-HW-Dev, to alleviate some of these issues. We benchmark the IIIT-HW-Dev dataset using a CNN-RNN hybrid architecture. Furthermore, using this architecture, we empirically show that synthetic data and cross-lingual transfer learning help alleviate the lack of training data. We apply the proposed pipeline to a public dataset, RoyDB, and achieve state-of-the-art results.
Keywords
CNN-RNN Hybrid Network, Devanagari Dataset, Handwriting Recognition, Benchmarking