ChatGPT4 outperforms endoscopists for determination of post-colonoscopy re-screening and surveillance recommendations

Clinical Gastroenterology and Hepatology(2024)

引用 0|浏览0
暂无评分
摘要
Background Large language models (LLM) including ChatGPT4 improve access to artificial intelligence, but their impact on the clinical practice of gastroenterology is undefined.In this study, we aim to compare the accuracy, concordance and reliability of ChatGPT4 colonoscopy recommendations for colorectal cancer re-screening and surveillance to contemporary guidelines and real-world gastroenterology practice. Methods History of present illness, colonoscopy data and pathology reports from patients undergoing procedures at two large academic centers were entered into ChatGPT4 and it was queried for next recommended colonoscopy follow-up interval. Using McNemar’s test and inter-rater reliability, we compared the recommendations made by ChatGPT4 with the actual surveillance interval provided in the endoscopist’s procedure report (gastroenterology practice) and the appropriate USMSTF guidance. The latter was generated for each case by an expert panel using the clinical information and guideline documents as reference. Results Text input of de-identified data into ChatGPT4 from 505 consecutive patients undergoing colonoscopy between January 1st and April 30th, 2023 elicited a successful follow-up recommendation in 99.2% of the queries.ChatGPT4 recommendations were in closer agreement with the USMSTF Panel (85.7%) than gastroenterology practice recommendations with the USMSTF Panel (75.4%) (P<.001). Of the 14.3% discordant recommendations between ChatGPT4 and USMSTF Panel, recommendations were for later screening in 26 (5.1%) and earlier screening in 44 (8.7%) cases. The inter-rater reliability was good for ChatGPT4 vs. USMSTF Panel (Fleiss κ: 0.786, CI95%: 0.734-0.838, P<.001). Conclusions Initial real-world results suggest that ChatGPT4 can accurately define routine colonoscopy screening intervals based on verbatim input of clinical data. LLM have potential for clinical applications, but further training is needed for broad use.
更多
查看译文
关键词
ChatGPT4,Large Language Model,Colorectal Neoplasms,Artificial Intelligence
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要