Degraded character recognition from old Kannada documents

International Journal of Electrical and Computer Engineering (IJECE) (2022)

Abstract
This paper addresses the preparation of a dataset of degraded Kannada characters and the robust recognition of such characters. The proposed recognition algorithm extracts histogram of oriented gradients (HOG) features with block sizes of 4x4 and 8x8, followed by principal component analysis (PCA) for feature reduction. Various classifiers are evaluated, and the fine K-nearest neighbor classifier performs best. The performance of the proposed model is evaluated using 5-fold cross-validation and the receiver operating characteristic curve. The dataset comprises 10440 characters in 156 classes (distinct characters), taken from 75 pages of poorly preserved old books. A comparison of the proposed model with other features, such as Haar wavelet and geometrical features, suggests that the proposed model is superior. The PCA-reduced features followed by the fine K-nearest neighbor classifier gave the best accuracy, with acceptance rates of 98.6% and 97.9% for block sizes of 4x4 and 8x8, respectively. The experimental results show that HOG feature extraction has a high recognition rate and that the system is robust even with extensively degraded characters.
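The pipeline described in the abstract (HOG features, PCA reduction, a nearest-neighbor classifier, 5-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. Several details are assumptions: the toy striped images stand in for the Kannada dataset, which is not reproduced here; the "block size" is interpreted as the HOG cell size in pixels; "fine" K-nearest neighbor is taken to mean a single neighbor (k = 1); and the number of PCA components (10) is an arbitrary placeholder. The helper names `make_toy_glyphs`, `hog_features`, and `evaluate` are hypothetical. The sketch relies on scikit-image and scikit-learn.

```python
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


def make_toy_glyphs(n_per_class=20, size=32, seed=0):
    """Synthetic stand-in data: three striped patterns plus Gaussian noise.

    Assumption: these are NOT Kannada characters, only placeholders so the
    sketch can run end to end.
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.indices((size, size))
    patterns = [(rows // 4) % 2, (cols // 4) % 2, ((rows + cols) // 4) % 2]
    images, labels = [], []
    for label, pattern in enumerate(patterns):
        for _ in range(n_per_class):
            images.append(pattern + rng.normal(0.0, 0.1, pattern.shape))
            labels.append(label)
    return np.array(images), np.array(labels)


def hog_features(images, cell):
    """Compute one HOG descriptor per image with cell x cell pixel cells."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(cell, cell),
            cells_per_block=(2, 2), block_norm="L2-Hys")
        for img in images
    ])


def evaluate(cell, n_components=10):
    """Return the five per-fold accuracies for a given HOG cell size."""
    images, labels = make_toy_glyphs()
    X = hog_features(images, cell)
    # PCA is fitted inside each training fold because it is part of the pipeline.
    model = make_pipeline(PCA(n_components=n_components),
                          KNeighborsClassifier(n_neighbors=1))
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(model, X, labels, cv=cv)


if __name__ == "__main__":
    for cell in (4, 8):
        scores = evaluate(cell)
        print(f"cell {cell}x{cell}: mean accuracy {scores.mean():.3f}")
```

Placing PCA inside the pipeline means the projection is learned only from the training portion of each fold, which avoids leaking test data into the feature reduction step. The accuracies printed for the toy data say nothing about the figures reported in the abstract.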
Keywords
old Kannada documents, degraded character recognition