IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity

AAAI 2024

Abstract
Integrating electronic health records (EHR) into machine learning-driven clinical research and hospital applications is important, as it harnesses extensive, high-quality patient data to improve outcome prediction and treatment personalization. However, because of privacy and security concerns, secondary use of EHR data, primarily for research, is consistently governed and regulated, which constrains researchers' access to the data. Generating synthetic EHR data with deep learning methods is a viable and promising way to mitigate privacy concerns: it offers a supplementary resource for downstream applications while sidestepping the confidentiality risks associated with real patient data. Although prior work has addressed EHR data synthesis, significant challenges persist: handling the heterogeneity of real EHR, which includes both temporal and non-temporal features; addressing missing values and irregular measures; and ensuring the privacy of the real data used for model training. Existing works address only one or two of these challenges. In this work, we propose IGAMT, an innovative framework for generating privacy-preserving synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also balances the privacy-utility trade-off. Extensive experiments show that IGAMT significantly outperforms baseline architectures in terms of visual resemblance and achieves comparable performance in downstream applications. Ablation case studies further support the effectiveness of the techniques applied in IGAMT.
Keywords
ML: Deep Generative Models & Autoencoders, ML: Privacy, ML: Time-Series/Data Streams