Handwritten Script Identification from Text Lines

arxiv(2020)

引用 0|浏览6
暂无评分
摘要
In a multilingual country like India where 12 different official scripts are in use, automatic identification of handwritten script facilitates many important applications such as automatic transcription of multilingual documents, searching for documents on the web/digital archives containing a particular script and for the selection of script specific Optical Character Recognition (OCR) system in a multilingual environment. In this paper, we propose a robust method towards identifying scripts from the handwritten documents at text line-level. The recognition is based upon features extracted using Chain Code Histogram (CCH) and Discrete Fourier Transform (DFT). The proposed method is experimented on 800 handwritten text lines written in seven Indic scripts namely, Gujarati, Kannada, Malayalam, Oriya, Tamil, Telugu, Urdu along with Roman script and yielded an average identification rate of 95.14% using Support Vector Machine (SVM) classifier.
更多
查看译文
关键词
lines,identification,text
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要