Contribution to the Automatic Recognition of Business Documents

International Conference on Frontiers in Handwriting Recognition (2013)

Abstract
The automatic processing of paper documents and mail is a major challenge for all companies. Current recognition systems use modular architectures in which each stage of the process is independent. To improve performance, cooperation between the modules must be reintroduced, for example by coupling the segmentation and recognition steps, or the zone-of-interest location and segmentation steps. In this context, we propose a mixed approach to text localization and image segmentation that respects real-time constraints. In the first part, we present the state of the art in text location and thresholding in images of postal addresses. In the second part, we describe our method, which simultaneously localizes and segments text zones. Text blocks are located using a multiresolution approach on cumulated gradients computed directly from grey-level images. Coupling the two processes (text zone location and thresholding) both reduces computing time, by processing only the necessary parts of the image, and yields better character segmentation for OCR (Optical Character Recognition). We present the results obtained by implementing our approach on an industrial line that processes several tons of documents from large companies every day.
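The coupling described in the abstract can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything here is an assumption for illustration: the function names, the cell size, the energy cutoff `min_energy`, the use of a single coarse level in place of a full multiresolution scheme, and the midpoint threshold are all hypothetical and are not the authors' method. The sketch only shows the general idea: cumulate grey-level gradients over coarse cells, keep the high-energy cells as the text zone, and threshold only inside that zone.

```python
def gradient_magnitude(img):
    """Sum of absolute forward differences in x and y for each pixel."""
    h, w = len(img), len(img[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(img[y][x + 1] - img[y][x]) if x + 1 < w else 0
            dy = abs(img[y + 1][x] - img[y][x]) if y + 1 < h else 0
            grad[y][x] = dx + dy
    return grad


def cumulate(grad, cell):
    """Sum gradients over non-overlapping cell x cell blocks (one coarse level)."""
    h, w = len(grad), len(grad[0])
    rows, cols = (h + cell - 1) // cell, (w + cell - 1) // cell
    acc = [[0] * cols for _ in range(rows)]
    for y in range(h):
        for x in range(w):
            acc[y // cell][x // cell] += grad[y][x]
    return acc


def locate_text(img, cell=4, min_energy=200):
    """Pixel bounding box (top, left, bottom, right) of high-energy cells, or None."""
    acc = cumulate(gradient_magnitude(img), cell)
    active = []
    for r, row in enumerate(acc):
        for c, energy in enumerate(row):
            if energy >= min_energy:  # hypothetical cutoff, not from the paper
                active.append((r, c))
    if not active:
        return None
    h, w = len(img), len(img[0])
    top = min(r for r, _ in active) * cell
    left = min(c for _, c in active) * cell
    bottom = min(h, (max(r for r, _ in active) + 1) * cell)
    right = min(w, (max(c for _, c in active) + 1) * cell)
    return top, left, bottom, right


def segment(img, cell=4, min_energy=200):
    """Binarize only inside the located box; all other pixels stay background (0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    box = locate_text(img, cell, min_energy)
    if box is None:
        return box, out
    top, left, bottom, right = box
    region = [img[y][x] for y in range(top, bottom) for x in range(left, right)]
    threshold = (min(region) + max(region)) / 2  # placeholder midpoint threshold
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 1 if img[y][x] < threshold else 0  # dark ink on light paper
    return box, out


# Synthetic 16x16 page: light background (200) with dark vertical strokes (20).
page = [[200] * 16 for _ in range(16)]
for y in range(5, 8):
    for x in range(4, 12, 2):
        page[y][x] = 20
box, binary = segment(page)
```

On this synthetic page, only the band of cells containing the strokes is thresholded, so the remaining three quarters of the image are never visited by the binarization loop. That is the source of the time saving the abstract attributes to the coupling. A real system would presumably refine the box over several resolutions and use a more robust local threshold.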
Keywords
text location, real time processing, business documents processing, image segmentation, business