Making head and neck cancer clinical data Findable-Accessible-Interoperable-Reusable (FAIR) to support multi-institutional collaboration and federated learning

BJR|Artificial Intelligence (2024)

Abstract

Objectives
Federated learning (FL) is a group of methodologies where statistical modelling can be performed without exchanging identifiable patient data between cooperating institutions. To realize its potential for AI development on clinical data, a number of bottlenecks need to be addressed. One of these is making data Findable-Accessible-Interoperable-Reusable (FAIR). The primary aim of this work is to show that tools making data FAIR allow consortia to collaborate on privacy-aware data exploration, data visualization and training of models on each other’s original data.

Methods
We propose a “Schema-on-Read” FAIR-ification method that adapts for different (re-)analyses without needing to change the underlying original data. The procedure involves (i) decoupling the contents of the data from its schema and database structure, (ii) annotation with semantic ontologies as a metadata layer and (iii) readout using semantic queries. Open source tools are given as Docker containers to help local investigators prepare their data on-premises.

Results
We created a federated privacy-preserving visualization dashboard for case mix exploration of five distributed datasets with no common schema at point of origin. We demonstrated robust and flexible prognostication model development and validation, linking together different data sources: clinical risk factors and radiomics.

Conclusions
Our procedure leads to successful (re-)use of data in FL-based consortia without the need to impose a common schema at every point of origin of data.

Advances in Knowledge
This work supports the adoption of FL within the healthcare AI community by sharing means to make data more FAIR.
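
The three steps named in the Methods can be illustrated with a toy sketch. It is not the authors' tooling: plain Python tuples stand in for a triple store, a dictionary lookup stands in for a semantic query language, and every site name, column name, and `concept:` code below is hypothetical.

```python
from collections import Counter

# (i) Decouple content from schema: flatten each local table into
# (subject, predicate, value) triples, keeping the site's own column names.
def to_triples(site, rows):
    triples = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            triples.append((f"{site}/row{i}", f"{site}/{column}", value))
    return triples

# (ii) Annotation layer: maps local columns and codes to shared concepts.
# It lives apart from the data, so the original tables are never edited.
# All concept identifiers here are made up for illustration.
ANNOTATIONS = {
    "siteA": {
        "columns": {"sex": "concept:BiologicalSex", "age": "concept:Age"},
        "values": {"concept:BiologicalSex": {"M": "concept:Male", "F": "concept:Female"}},
    },
    "siteB": {
        "columns": {"geslacht": "concept:BiologicalSex", "leeftijd": "concept:Age"},
        "values": {"concept:BiologicalSex": {"man": "concept:Male", "vrouw": "concept:Female"}},
    },
}

# (iii) Readout: ask for a shared concept, not for a local column name.
def query(site, triples, concept):
    annotation = ANNOTATIONS[site]
    value_map = annotation["values"].get(concept, {})
    results = []
    for _subject, predicate, value in triples:
        column = predicate.split("/", 1)[1]
        if annotation["columns"].get(column) == concept:
            results.append(value_map.get(value, value))
    return results

# Each site answers with aggregate counts only; no row leaves the site.
def local_counts(site, rows, concept):
    return Counter(query(site, to_triples(site, rows), concept))

# Two hypothetical sites with different schemas for the same information.
SITE_A = [{"sex": "M", "age": 61}, {"sex": "F", "age": 58}]
SITE_B = [
    {"geslacht": "vrouw", "leeftijd": 70},
    {"geslacht": "man", "leeftijd": 66},
    {"geslacht": "man", "leeftijd": 49},
]

total = (local_counts("siteA", SITE_A, "concept:BiologicalSex")
         + local_counts("siteB", SITE_B, "concept:BiologicalSex"))
print(dict(total))
```

Because the query is phrased against the shared concept, a new analysis would only need a new or extended annotation entry, while the site tables stay as they are. That is the sense of "Schema-on-Read" this sketch tries to convey.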