One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
arXiv (2024)
Abstract
As Large Language Models (LLMs) are nondeterministic, the same input can
generate different outputs, some of which may be incorrect or hallucinated. If
run again, the LLM may correct itself and produce the correct answer.
Unfortunately, most LLM-powered systems present a single result, which users
accept whether or not it is correct. Having the LLM produce multiple outputs
may help identify disagreements or alternatives, but it is not obvious how
users will interpret such conflicts or inconsistencies. To this end, we
investigate how users perceive the AI model and comprehend the generated
information when they receive multiple, potentially inconsistent, outputs.
Through a preliminary study, we identified five types of output inconsistency.
Based on these categories, we conducted a study (N=252) in which participants
were given one or more LLM-generated passages in response to an
information-seeking question. We found that inconsistency across multiple
LLM-generated outputs lowered participants' perceived AI capability while
increasing their comprehension of the given information. Notably, this
positive effect of inconsistency was strongest for participants who read two
passages, compared with those who read three. Based on these findings, we
present design implications: rather than treating LLM output inconsistencies
as a drawback, systems can surface them to transparently indicate the
limitations of these models and promote critical LLM usage.