Table Analysis and Information Extraction for Medical Laboratory Reports
2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech) (2018)
Abstract
Medical laboratory reports are essential documents for health care professionals in patient assessment, diagnosis, and long-term monitoring. Compared with paper files, electronic records are easier to keep up to date, complete, and accurate, and they are already common in modern medical systems. However, recognition of historical medical laboratory reports is still in great demand, especially in developing countries. In this paper, we present a document image processing system for extracting information from medical laboratory reports. Given an image of a medical laboratory report, its table areas and text regions are first segmented following a top-down pipeline. Recognition is then performed on every text region, which may contain Arabic numerals, mathematical symbols, and multilingual characters. We evaluate the system on a new dataset of medical laboratory reports that includes both scanned and camera-captured images. Our experiments demonstrate that the proposed system can effectively segment a medical document according to its layout and recognize text that mixes multiple types of characters and symbols, and can thereby obtain information from medical laboratory reports. The proposed system and the public dataset will benefit remote healthcare in developing countries.
Keywords
OCR, Layout analysis, Medical application