Key Information Extraction from Mobile-Captured Vietnamese Receipt Images using Graph Neural Networks Approach

Van Dung Pham, Le Quan Nguyen, Nhat Truong Pham, Bao Hung Nguyen, Duc Ngoc Minh Dang, Sy Dzung Nguyen

2022 6th International Conference on Green Technology and Sustainable Development (GTSD) (2022)

Abstract
Information extraction and retrieval are growing fields that play a significant role in document parsing and analysis systems. Research and applications developed in recent years reveal numerous difficulties in extracting key information from documents. With the rise of graph theory and deep learning, graph representation and graph learning have been widely applied to information extraction to obtain more accurate results. In this paper, we propose a solution based on graph neural networks (GNN) for key information extraction (KIE) from mobile-captured Vietnamese receipt images. First, the images are pre-processed using U²-Net, and a CRAFT model is then used to detect text in the pre-processed images. Next, the implemented TransformerOCR model is employed for text recognition. Finally, a GNN-based model is designed to extract the key information from the recognized text. To validate the effectiveness of the proposed solution, the publicly available dataset released for the Mobile-Captured Receipt Recognition (MC-OCR) Challenge 2021 is used for training and evaluation. The experimental results indicate that our proposed solution achieves a character error rate (CER) score of 0.25 on the private test set, which is comparable to the solutions reported for the MC-OCR Challenge 2021 in the literature. For reproducibility and knowledge sharing, our implementation of the proposed solution is publicly available at https://github.com/ThorPhamlKey_infomation_extraction.
Keywords
Graph Neural Networks, Key Information Extraction, Optical Character Recognition, Text Detection, Text Recognition
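
The abstract reports its result as a character error rate (CER). As a rough illustration of what such a score measures, the sketch below computes CER in its common form: the Levenshtein (edit) distance between a predicted string and a reference string, divided by the reference length. The function names `levenshtein` and `cer` are hypothetical, and the exact definition and averaging scheme used by the MC-OCR Challenge 2021 may differ from this simplified version.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of `hyp` against `ref` (assumed definition:
    edit distance normalised by the reference length)."""
    if not ref:
        # Assumption: an empty reference scores 0 only if the
        # prediction is empty as well.
        return 0.0 if not hyp else 1.0
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a lower score is better, and a score of 0.25 corresponds to roughly one character-level edit for every four reference characters.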