Enhancing optical character recognition: Efficient techniques for document layout analysis and text line detection

ENGINEERING REPORTS (2023)

Abstract
In recent years, automatic document and text analysis has gained significant importance, driven by advances in optical character recognition (OCR) technology and the need to process large volumes of printed or handwritten documents efficiently. This article focuses on document layout analysis (DLA) and text line detection (TLD), both of which are crucial components of OCR systems. Our objective is to develop an effective method for extracting both textual and non-textual regions while addressing challenges unique to Persian and Persian-like languages. In the DLA stage, we employ deep learning models and a voting system to determine the regions of interest accurately. In the TLD stage, we introduce optimum font size concepts, angle correction, and a line curvature elimination algorithm to enhance OCR accuracy. Comparative evaluations against state-of-the-art methods show a 2.8% improvement in the accuracy of Tesseract-OCR 5.1.0 (a well-established open-source OCR system) on the official Iranian newspapers dataset. These findings underscore the importance of addressing DLA and TLD challenges to advance OCR technology for Persian-language documents, and they provide a solid foundation for future research in this domain.

Our proposed method introduces several key novelties that contribute to the advancement of OCR systems. We collected and presented a valuable dataset for training and evaluating OCR models. The method addresses challenges associated with DLA and TLD in OCR systems, particularly for the Persian language. We significantly improve the accuracy of OCR systems by employing deep learning models and a voting system in the DLA stage, and by introducing angle correction methods, optimum font size concepts, and an efficient algorithm to eliminate line curvature.
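The abstract does not say how the voting system combines the outputs of the deep learning models. As an illustration only, the sketch below assumes each model produces a grid of region labels of the same shape and fuses them by strict per-cell majority. The function name `majority_vote`, the label strings, and the `fallback` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence

# A label map is a 2D grid of region labels such as "text" or "image".
LabelMap = List[List[str]]


def majority_vote(label_maps: Sequence[LabelMap],
                  fallback: str = "background") -> LabelMap:
    """Fuse label maps from several models by strict per-cell majority.

    Assumption: a cell keeps a label only if more than half of the
    models agree on it; otherwise the cell is assigned `fallback`.
    """
    if not label_maps:
        raise ValueError("at least one label map is required")

    rows = len(label_maps[0])
    cols = len(label_maps[0][0]) if rows else 0
    for label_map in label_maps:
        if len(label_map) != rows or any(len(row) != cols for row in label_map):
            raise ValueError("all label maps must share the same shape")

    needed = len(label_maps) // 2 + 1  # votes required for a strict majority
    fused: LabelMap = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            votes = Counter(label_map[r][c] for label_map in label_maps)
            label, count = votes.most_common(1)[0]
            fused_row.append(label if count >= needed else fallback)
        fused.append(fused_row)
    return fused


if __name__ == "__main__":
    model_a = [["text", "text"], ["image", "background"]]
    model_b = [["text", "image"], ["image", "text"]]
    model_c = [["text", "image"], ["table", "background"]]
    for row in majority_vote([model_a, model_b, model_c]):
        print(row)
```

A strict majority with a fallback label is only one possible design. Weighted votes or confidence-based fusion would be equally plausible readings of "a voting system".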
Keywords
connected component, document layout analysis, font size, line detection, Persian printed