Towards System Modelling to Support Diseases Data Extraction from the Electronic Health Records for Physicians Research Activities
arXiv (2024)
Abstract
The use of Electronic Health Records (EHRs) has increased dramatically in the
past 15 years, as they are considered an important source for managing patient
data. EHRs are primary sources of disease diagnosis and demographic
data of patients worldwide. Therefore, the data can be utilized for secondary
tasks such as research. This paper aims to make such data usable for research
activities such as monitoring disease statistics for a specific population. As
a result, researchers can relate disease causes to the behavior and
lifestyle of the target group. One limitation of EHR systems is that
the data is not available in a standard format but in various forms.
Therefore, the disease names and demographic data must first be converted
into one standardized form to make them usable for research
activities. A large volume of EHRs is available, so solving the
standardization problem requires optimized techniques. We used a first-hand
EHR dataset extracted from EHR systems. Our application uploads the dataset
from the EHRs and converts it to the ICD-10 coding system to solve the
standardization problem. We apply pre-processing, annotation, and
transformation steps to convert the data into the standard form.
Pre-processing normalizes the demographic formats. In the
annotation step, a machine learning model is used to recognize the diseases
from the text. The transformation step then converts each disease name to
the ICD-10 coding format. The model was evaluated manually by comparing its
performance in terms of disease recognition with an available dictionary-based
system (MetaMap). The accuracy of the proposed machine learning model is 81%,
outperforming MetaMap's accuracy of 67%.
modelling for EHR data extraction to support research activities.
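
The three steps named in the abstract (pre-processing, annotation, transformation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields, the accepted date formats, the function names, and the three-entry lookup table are hypothetical, and the annotation step uses simple dictionary matching as a stand-in for the machine learning model the paper describes, which is not specified here.

```python
import re
from datetime import datetime

# Hypothetical lookup table: a tiny illustrative subset of ICD-10 categories.
ICD10_CODES = {
    "type 2 diabetes mellitus": "E11",
    "essential hypertension": "I10",
    "asthma": "J45",
}

# Assumed input formats; real EHR exports may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")
GENDER_MAP = {"m": "male", "male": "male", "f": "female", "female": "female"}


def normalize_date(raw):
    """Pre-processing: return the date as YYYY-MM-DD, or None if unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_gender(raw):
    """Pre-processing: map gender spellings to one form, or None if unknown."""
    return GENDER_MAP.get(raw.strip().lower())


def annotate_diseases(text):
    """Annotation stand-in: whole-phrase dictionary matching, not an ML model."""
    lowered = text.lower()
    found = []
    for name in ICD10_CODES:
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            found.append(name)
    return found


def to_icd10(names):
    """Transformation: convert recognized disease names to ICD-10 codes."""
    return [ICD10_CODES[name] for name in names]


def standardize(record):
    """Run the three steps on one record and return the standardized form."""
    diseases = annotate_diseases(record["note"])
    return {
        "birth_date": normalize_date(record["birth_date"]),
        "gender": normalize_gender(record["gender"]),
        "diseases": diseases,
        "icd10": to_icd10(diseases),
    }


record = {
    "birth_date": "07/03/1961",
    "gender": "F",
    "note": "Patient has Type 2 diabetes mellitus and essential hypertension.",
}
result = standardize(record)
print(result)
```

Keeping the steps as separate functions means the dictionary matcher could be replaced by a trained recognizer without touching the demographic normalization or the code mapping.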