Validating the Generalizability of Ophthalmic Artificial Intelligence Models on Real-World Clinical Data
Translational Vision Science & Technology (2023)
Abstract
Purpose: This study investigates the generalizability of deep learning (DL) models trained on commonly used public fundus images to an instance of real-world data (RWD) for glaucoma diagnosis. Methods: We used the Illinois Eye and Ear Infirmary fundus data set as an instance of RWD, in addition to six publicly available fundus data sets. We compared the performance of DL models trained on public data and on RWD for glaucoma classification and optic disc (OD) segmentation tasks. For each task, we trained models on each of the two data sources (public and RWD) and tested each model on both. We further examined each model's decision-making process and learned embeddings for the glaucoma classification task. Results: On the public test set, public-trained models outperformed RWD-trained models in OD segmentation and glaucoma classification, with a mean intersection over union of 96.3% and a mean area under the receiver operating characteristic curve of 95.0%, respectively. On the RWD test set, the performance of public models decreased by 8.0% and 18.4% to 88.3% and 76.6% for OD segmentation and glaucoma classification tasks, respectively. RWD models outperformed public models on the RWD test set by 2.0% and 9.5% in OD segmentation and glaucoma classification tasks, respectively. Conclusions: DL models trained on commonly used public data have limited ability to generalize to RWD for classifying glaucoma. They perform similarly to RWD models for OD segmentation.
Keywords
generalizability, deep learning, computer-aided diagnosis, glaucoma classification, optic disc segmentation, fundus image