Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts for Open-Domain QA?
CoRR (2024)
Abstract
While auxiliary information has become key to enhancing Large Language Models
(LLMs), relatively little is known about how well LLMs merge such contexts,
specifically contexts generated by LLMs themselves and those retrieved from
external sources. To study this, we formulate a task specifically designed to
identify whether an answer, derived from the integration of generated and
retrieved contexts, is attributable to one or the other. To support this task,
we develop a methodology for constructing datasets with conflicting contexts,
where each question is paired with both a generated and a retrieved context,
yet only one of them contains the correct answer. Our experiments reveal a
significant bias in LLMs towards generated contexts, as evidenced across
state-of-the-art open (Llama2-7b/13b) and closed (GPT-3.5/4) systems. We
further identify two key factors contributing to this bias: (i) contexts
generated by LLMs typically show greater similarity to the questions,
increasing their likelihood of selection; (ii) the segmentation process used
for retrieved contexts disrupts their completeness, thereby hindering their
full utilization by LLMs. Our analysis enhances the understanding of how LLMs
merge diverse contexts, offering valuable insights for advancing current
augmentation methods for LLMs.
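Factor (i) above can be illustrated with a minimal sketch. This is not the paper's measurement procedure; it uses a simple lexical-overlap (Jaccard) similarity and hypothetical example strings to show why a generated context, which often restates the question's wording, tends to score as more similar to the question than a retrieved document chunk with different phrasing.

```python
import re


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two strings."""
    tokens_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


question = "Who wrote the novel Nineteen Eighty-Four?"
# Hypothetical generated context: restates much of the question's wording.
generated = "The novel Nineteen Eighty-Four was written by George Orwell."
# Hypothetical retrieved passage: a document chunk phrased independently.
retrieved = "Orwell's dystopian fiction, published in 1949, depicts a totalitarian state."

sim_gen = jaccard_similarity(question, generated)
sim_ret = jaccard_similarity(question, retrieved)
# The generated context is lexically much closer to the question,
# so a similarity-driven selection mechanism would favor it.
assert sim_gen > sim_ret
```

Real systems would use embedding-based rather than lexical similarity, but the same echo effect applies: text generated in response to a question naturally mirrors the question's surface form.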