Separating Handwritten Material from Machine Printed Text Using Hidden Markov Models

ICDAR-1(2001)

Abstract: In this paper, we address the problem of separating handwritten annotations from machine printed text within a document. We present an algorithm based on the theory of hidden Markov models (HMMs) to distinguish between machine printed and handwritten material. No OCR results are required before or during the process, and classification is performed at the word level. Handwritten annotations are not limited to marginal areas: the approach can handle document images in which handwritten annotations overlay machine printed text, and it shows promise in our experiments. Experimental results show that the proposed method achieves 72.19% recall for fully extracted handwritten words and 90.37% for partially extracted ones. The precision of extracting handwritten words reaches 92.86%.
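To make the idea of word-level HMM labelling concrete, the following is a minimal sketch of Viterbi decoding over a sequence of words with two hidden states, "printed" and "handwritten". It is not the algorithm from the paper: the observation symbols ("regular", "irregular"), the probability tables START, TRANS, and EMIT, and the function name viterbi are all hypothetical values chosen for illustration, since the abstract does not specify the features or model parameters used.

```python
import math

STATES = ("printed", "handwritten")

# Hypothetical parameters for illustration only (not taken from the paper).
START = {"printed": 0.8, "handwritten": 0.2}
TRANS = {
    "printed": {"printed": 0.9, "handwritten": 0.1},
    "handwritten": {"printed": 0.3, "handwritten": 0.7},
}
# Each word is reduced to a coarse, made-up observation symbol.
EMIT = {
    "printed": {"regular": 0.85, "irregular": 0.15},
    "handwritten": {"regular": 0.2, "irregular": 0.8},
}


def viterbi(observations):
    """Return the most likely state sequence for a list of observation symbols."""
    if not observations:
        return []
    # Log-probabilities avoid numerical underflow on long sequences.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            pointers[s] = best
            score[s] = (
                prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            )
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# One symbol per word, in reading order (made-up input).
words = ["regular", "regular", "irregular", "irregular", "irregular"]
labels = viterbi(words)
print(list(zip(words, labels)))
```

With these assumed tables, the first two words are labelled "printed" and the last three "handwritten". The transition probabilities favour staying in the same state, so an isolated irregular word is less likely to flip the label than a run of them.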
Keywords
hidden markov models,word level,handwritten word,handwritten annotation,marginal area,hidden markov model,ocr result,separating handwritten material,machine printed text,handwritten material,document image,neural networks,handwriting recognition,recall,data mining,engines,precision