Rare Disease Prediction By Generating Quality-Assured Electronic Health Records

Proceedings of the 2020 SIAM International Conference on Data Mining (SDM), 2020

Abstract
Predicting diseases for patients is an important and practical task in healthcare informatics. Existing disease prediction models focus on common diseases, that is, diseases for which enough electronic health record (EHR) data and prior medical knowledge are available for analysis. However, those models may not work for rare disease prediction, because it is extremely hard to collect enough EHR data from patients with such diseases. To tackle this issue, we design a novel rare disease prediction system that not only generates EHR data but also automatically selects high-quality generated data to further improve predictive performance. The system has three components: data generation, data selection, and prediction. In particular, we propose MaskEHR to generate diverse EHR data based on data from patients suffering from the given diseases. To remove noisy information from the generated EHR data, we further design a reinforcement learning-based data selector, called RL-Selector, which automatically chooses the high-quality generated EHR data. Finally, the prediction component identifies patients who will potentially suffer from the given diseases. These three components work together and enhance each other. Experiments on three real healthcare datasets show that the proposed system outperforms existing approaches on the rare disease prediction task.
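The generate-then-select flow described in the abstract can be illustrated with a toy sketch. This is not the paper's MaskEHR or RL-Selector: the abstract gives no implementation details, so the sketch swaps in deliberately simple stand-ins. Generation is random masking and refilling of diagnosis codes, and selection is a fixed Jaccard-similarity threshold rather than a learned reinforcement learning policy. All function names, the record layout (a list of visits, each a list of code strings), and the parameters are assumptions made for illustration only.

```python
import random


def mask_and_refill(record, vocab, mask_rate, rng):
    """Toy generator (stand-in for MaskEHR, NOT the paper's model):
    replace each code with a random vocabulary code with probability mask_rate."""
    return [
        [rng.choice(vocab) if rng.random() < mask_rate else code for code in visit]
        for visit in record
    ]


def flatten(record):
    """Collect all codes of a record (list of visits) into one flat list."""
    return [code for visit in record for code in visit]


def jaccard(a, b):
    """Jaccard similarity between two collections of codes."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def select(candidates, reference, threshold):
    """Toy selector (stand-in for RL-Selector, NOT reinforcement learning):
    keep generated records whose codes overlap enough with the real record."""
    ref_codes = flatten(reference)
    return [c for c in candidates if jaccard(flatten(c), ref_codes) >= threshold]


def augment(real_records, vocab, n_per_record=5, mask_rate=0.3, threshold=0.5, seed=0):
    """Generate candidates from each real record, then keep only the selected ones."""
    rng = random.Random(seed)
    kept = []
    for record in real_records:
        candidates = [
            mask_and_refill(record, vocab, mask_rate, rng) for _ in range(n_per_record)
        ]
        kept.extend(select(candidates, record, threshold))
    return kept
```

In this sketch, the prediction component would be any downstream classifier trained on the real records plus the output of `augment`. In the system described above, the three components enhance each other, which a fixed threshold filter like this one does not capture.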
Keywords
disease, prediction, health, quality-assured