An Approach to Analyzing the Layout of Unstructured Digital Documents.

Signal Processing and Communications Applications Conference (SIU) (2022)

Abstract
Extracting personal information from documents is important for protecting personal data. When a form or document has a standard structure, personal information can be identified with methods from machine learning, natural language processing, image processing, optical character recognition, and related fields. When a document lacks a standard structure, however, it may be necessary to analyze its layout and to process each of the document or form structures it contains separately. The literature offers several methods for document layout analysis, but each has its own parameters. This study proposes a new approach for common parameter tuning. The approach reduces the number of required parameters and narrows the range over which they must be adjusted. In the experiments, the proposed approach achieved a recall of 94%.
Keywords
personal data, personal data detection, document layout analysis, image detection, automatic parameter estimation