Test-Driven Anonymization in Health Data: A Case Study on Assistive Reproduction

2020 IEEE International Conference on Artificial Intelligence Testing (AITest), 2020

Abstract
Artificial intelligence (AI) is a broad field whose prevalence in the health sector has increased in recent years. Clinical data are the staple that feeds intelligent healthcare applications, but because of their sensitive nature, sharing them with third parties and letting third parties use them requires compliance with both confidentiality agreements and security measures. Data anonymization has emerged as a way to increase data privacy and reduce the risk of unintentional disclosure of sensitive information by modifying the data. Although anonymization improves privacy, these modifications also harm the functional suitability of the data. They can affect applications that use the anonymized data, especially data-centric ones such as AI tools. To obtain a trade-off between the two qualities (privacy and functional suitability), we use the Test-Driven Anonymization (TDA) approach, which incrementally anonymizes the data used to train the AI tools and validates the tools against the real data until quality is maximized. The approach is evaluated on a real-world dataset from the Spanish Institute for the Study of the Biology of Human Reproduction (INEBIR). The anonymized datasets are used to train AI tools, and the dataset that achieves the best trade-off between privacy and functional quality requirements is selected. The results show that TDA can be successfully applied to anonymize the clinical data of the INEBIR, so that the data can be transferred to third parties without violating user privacy and those parties can develop useful AI tools with the anonymized data.
Keywords
Anonymization, Software Testing, Artificial Intelligence, Health-Care Data, k-Anonymity