ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-university level: A pattern of responses of generative artificial intelligence or large language models

Current Research in Biotechnology (2024)

Abstract
Researchers have recently raised concerns about ChatGPT-derived answers. Here, we conducted a series of tests using ChatGPT at 11 universities worldwide to understand the pattern of its answer accuracy, reproducibility, answer length, plagiarism, and depth, using two questionnaires (the first with 15 MCQs and the second with 15 KBQs). Among the 15 MCQ-generated answers, 13 ± 70 were correct (median: 82.5; coefficient of variation: 4.85), 3 ± 0.77 were incorrect (median: 3; coefficient of variation: 25.81), 1 to 10 were reproducible, and 11 to 15 were not. Among the 15 KBQs, the answer length for each question (in words) was about 294.5 ± 97.60 (range: 138.7 to 438.09), and the mean similarity index (in words) was about 29.53 ± 11.40 (coefficient of variation: 38.62) for each question. Statistical models were also developed using the analyzed answer parameters. The study shows a pattern of correct and incorrect ChatGPT-derived answers and underscores the urgent need for an error-free, next-generation LLM to avoid misguiding users.
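The summary statistics quoted in the abstract (mean ± standard deviation, median, and the coefficient figure, read here as the coefficient of variation in percent) can be illustrated with a minimal Python sketch. The `summarize` helper and the `incorrect_counts` list are hypothetical and chosen only for illustration; they are not the study's data, and the use of the sample standard deviation is an assumption.

```python
import statistics


def summarize(counts):
    """Return mean, sample SD, median, and coefficient of variation (%)."""
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)  # sample standard deviation (n - 1); an assumption
    return {
        "mean": mean,
        "sd": sd,
        "median": statistics.median(counts),
        "cv_percent": 100 * sd / mean,  # CV = SD / mean, expressed in percent
    }


# Hypothetical number of incorrect MCQ answers at each of 11 test sites
# (illustrative values only, not taken from the paper).
incorrect_counts = [2, 3, 3, 4, 3, 2, 4, 3, 3, 4, 2]

summary = summarize(incorrect_counts)
print(summary)
```

Under these assumptions, a mean near 3 with a standard deviation near 0.77 gives a coefficient of variation of roughly 26%, which is the same order as the figure reported for incorrect answers.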
Keywords
ChatGPT, Accuracy, Reproducibility, Plagiarism, Answer length