Towards a Robust Retrieval-Based Summarization System
arXiv (2024)
Abstract
This paper describes an investigation of the robustness of large language
models (LLMs) for retrieval-augmented generation (RAG)-based summarization
tasks. While LLMs provide summarization capabilities, their performance in
complex, real-world scenarios remains under-explored. Our first contribution is
LogicSumm, an innovative evaluation framework incorporating realistic scenarios
to assess LLM robustness during RAG-based summarization. Based on limitations
identified by LogicSumm, we then developed SummRAG, a comprehensive system to
create training dialogues and fine-tune a model to enhance robustness within
LogicSumm's scenarios. SummRAG is an example of our goal of defining structured
methods to test the capabilities of an LLM, rather than addressing issues in a
one-off fashion. Experimental results show that SummRAG improves logical
coherence and summarization quality. Data, corresponding model
weights, and Python code are available online.