Evaluation of AI-generated Responses by Different Artificial Intelligence Chatbots to the Clinical Decision-Making Case-Based Questions in Oral and Maxillofacial Surgery

Ali Azadi, Fatemeh Gorjinejad, Hosein Mohammad-Rahimi, Reza Tabrizi, Mostafa Alam, Mohsen Golkar

Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology (2024)

Abstract
Objectives: This study evaluates the correctness of answers generated by the Google Bard, GPT-3.5, GPT-4, Claude-Instant, and Bing chatbots to clinical decision-making questions in oral and maxillofacial surgery (OMFS).

Study Design: A group of three board-certified oral and maxillofacial surgeons designed a questionnaire of 50 case-based questions in multiple-choice and open-ended formats. The chatbots' answers to the multiple-choice questions were compared against the option chosen by the three referees. The chatbots' answers to the open-ended questions were evaluated using the modified global quality scale. A p-value under 0.05 was considered significant.

Results: Bard, GPT-3.5, GPT-4, Claude-Instant, and Bing answered 34%, 36%, 38%, 38%, and 26% of the questions correctly, respectively. In the open-ended questions, GPT-4 had the most answers graded “4” or “5,” and Bing had the most answers graded “1” or “2.” There were no statistically significant differences between the five chatbots in responding to the open-ended (P = 0.275) or multiple-choice (P = 0.699) questions.

Conclusion: Considering the major inaccuracies in the chatbots' responses, despite their relatively good performance on open-ended questions, this technology cannot yet be trusted as a consultant for clinicians in decision-making situations.

Clinical Significance: These results can affect the way OMFS is practiced. Providing clinicians with access to AI-generated responses to clinical decision-making questions can help them make more informed and accurate decisions, which can lead to better patient outcomes.
Keywords
Artificial Intelligence, Dentistry, Machine Learning, Oral and Maxillofacial Surgeons, Oral Surgery
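
The reported accuracy figures can be turned into a small worked calculation. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: that each chatbot was scored on all 50 questions, and that the between-chatbot comparison can be approximated with a Pearson chi-square test of homogeneity. The function names are hypothetical, and the abstract does not name the test the authors actually used.

```python
import math

# Percent of questions answered correctly, as reported in the abstract.
PERCENT_CORRECT = {
    "Bard": 34,
    "GPT-3.5": 36,
    "GPT-4": 38,
    "Claude-Instant": 38,
    "Bing": 26,
}
N_QUESTIONS = 50  # assumption: every chatbot was scored on all 50 questions


def correct_counts(percent_correct, n_questions):
    """Convert reported percentages into whole-number counts of correct answers."""
    return {name: pct * n_questions // 100 for name, pct in percent_correct.items()}


def chi_square_homogeneity(counts, n_questions):
    """Pearson chi-square statistic for a (chatbots x correct/incorrect) table.

    Every chatbot answers the same number of questions, so the expected
    number of correct answers is the same for each of them.
    """
    k = len(counts)
    expected_correct = sum(counts.values()) / k
    expected_incorrect = n_questions - expected_correct
    stat = 0.0
    for correct in counts.values():
        incorrect = n_questions - correct
        stat += (correct - expected_correct) ** 2 / expected_correct
        stat += (incorrect - expected_incorrect) ** 2 / expected_incorrect
    return stat, k - 1  # statistic and degrees of freedom


def chi_square_sf_df4(stat):
    """Upper-tail probability of a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-stat / 2) * (1 + stat / 2)


counts = correct_counts(PERCENT_CORRECT, N_QUESTIONS)
stat, df = chi_square_homogeneity(counts, N_QUESTIONS)
p_value = chi_square_sf_df4(stat)  # closed form is valid only because df == 4

print(counts)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Under these assumptions the counts are 17, 18, 19, 19, and 13 correct answers out of 50, and the sketch yields a p-value of about 0.699, far above the 0.05 threshold. That figure is in line with the value reported for the multiple-choice comparison, but the agreement does not by itself confirm which test the authors applied.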