A Culturally Sensitive Test to Evaluate Nuanced GPT Hallucination

IEEE Transactions on Artificial Intelligence (2023)

Abstract
Generative Pre-trained Transformer (GPT) models, renowned for generating human-like text, occasionally produce “hallucinations”: outputs that diverge from human expectations. Current mitigation strategies for GPT hallucinations rely largely on algorithmic automation and therefore overlook the complexities of human judgment and cultural influence, particularly in the interpretation of facts. To address this issue, we introduce a Culturally Sensitive Test that integrates language subjectivity, cultural nuances, and GPT idiosyncrasies. We applied this test to five GPT models (OpenAI’s ChatGPT-3.5 and ChatGPT-4, Google’s Bard, Perplexity AI, and TruthGPT), evaluating their responses to 70 questions across seven categories designed to provoke hallucinations. The evaluated models showed varying performance; controversial topics, topics lacking clear scientific consensus, and brain teasers proved more susceptible to GPT hallucinations. Our study paves the way for a more nuanced assessment of GPT hallucinations.
Keywords
Generative AI, Culturally Sensitive Test, AI Hallucinations, Generative Pre-trained Transformer, AI Evaluation, Cultural Nuance