Privacy-Preserving Collaborative Learning for Genome Analysis via Secure XGBoost

IEEE Transactions on Dependable and Secure Computing (2024)

Abstract
Genomic data is usually stored in a decentralized manner among data providers, who cannot share it publicly due to privacy concerns. A significant technical challenge is to combine machine learning and cryptographic techniques to build secure machine learning models over distributed datasets without violating privacy. In collaborative machine learning, data providers want to maintain the privacy of their genomic data, while the researcher who owns the training model wants to keep the model and training methods confidential. This paper proposes a framework that supports secure collaborative learning tasks while protecting both the participants' genomic data and the training model information at the same time. Using a cluster of Intel SGX enclaves, our work performs fast distributed training across the enclaves, with a dedicated enclave used solely for updating the global model. Secure XGBoost is also implemented over these hardware enclaves for fast learning, and it strengthens the enclaves' security with data-oblivious algorithms that eliminate side-channel attacks. Experimental results show that our scheme is fast and efficient in collaborative learning systems without increasing communication overhead, making it practical for large genomic datasets.
Keywords
Genome analysis, Gradient descent, Collaborative learning, Secure XGBoost, Intel-SGX, Privacy-preserving