Corpus-Steered Query Expansion with Large Language Models
Conference of the European Chapter of the Association for Computational Linguistics (2024)
Abstract
Recent studies demonstrate that large language models (LLMs) can considerably
enhance information retrieval systems by generating hypothetical documents
that answer the queries and using them as query expansions.
However, challenges arise from misalignments between the expansions and the
retrieval corpus, resulting in issues like hallucinations and outdated
information due to the limited intrinsic knowledge of LLMs. Inspired by Pseudo
Relevance Feedback (PRF), we introduce Corpus-Steered Query Expansion (CSQE) to
promote the incorporation of knowledge embedded within the corpus. CSQE
utilizes the relevance-assessing capability of LLMs to systematically identify
pivotal sentences in the initially retrieved documents. These corpus-originated
texts are subsequently used to expand the query together with
LLM-knowledge-empowered expansions, improving the relevance prediction between the query and
the target documents. Extensive experiments reveal that CSQE exhibits strong
performance without necessitating any training, especially with queries for
which LLMs lack knowledge.
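
The pipeline described above (initial retrieval, LLM-based selection of pivotal sentences, then expansion of the query with both corpus-originated and LLM-generated text) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the token-overlap retriever, the `keyword_judge` stand-in for the LLM relevance judgment, the plain concatenation in `expand_query`, and all function names are assumptions made for the example.

```python
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, doc: str) -> int:
    """Toy lexical score (assumption): count of query-token occurrences matched in doc."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Return the k highest-scoring documents; ties keep corpus order."""
    return sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)[:k]


def split_sentences(doc: str) -> List[str]:
    """Naive sentence splitter on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def pivotal_sentences(
    query: str, docs: List[str], is_relevant: Callable[[str, str], bool]
) -> List[str]:
    """Keep sentences from the retrieved docs that the judge marks as relevant.

    In the paper's setting the judge would be an LLM; here it is any callable.
    """
    return [s for doc in docs for s in split_sentences(doc) if is_relevant(query, s)]


def expand_query(
    query: str, corpus_sentences: List[str], llm_expansions: List[str]
) -> str:
    """Concatenate the query with both kinds of expansion text (assumed scheme)."""
    return " ".join([query, *corpus_sentences, *llm_expansions])


def keyword_judge(query: str, sentence: str) -> bool:
    """Hypothetical placeholder for the LLM: relevant if two or more tokens are shared."""
    return len(set(tokenize(query)) & set(tokenize(sentence))) >= 2
```

In use, one would call `retrieve` once with the original query, pass the results through `pivotal_sentences`, and then call `retrieve` again with the output of `expand_query`. Swapping `keyword_judge` for a real LLM call and the overlap scorer for a standard retriever such as BM25 are the obvious next steps, though the abstract does not specify either component.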