Improving Attributed Text Generation of Large Language Models via Preference Learning
arXiv (2024)
Abstract
Large language models have been widely adopted in natural language
processing, yet they face the challenge of generating unreliable content.
Recent works aim to reduce misinformation and hallucinations by resorting to
attribution as a means to provide evidence (i.e., citations). However, current
attribution methods usually focus on the retrieval stage and automatic
evaluation, neglecting to mirror the citation mechanisms that human scholarly
writing uses to bolster credibility. In this paper, we address these challenges by
modelling the attribution task as preference learning and introducing an
Automatic Preference Optimization (APO) framework. First, we create a curated
collection for post-training with 6,330 examples by collecting and filtering
from existing datasets. Second, considering the high cost of labelling
preference data, we propose an automatic method to synthesize
attribution preference data, resulting in 95,263 pairs. Moreover, inspired by
the human citation process, we propose a progressive preference
optimization method by leveraging fine-grained information. Extensive
experiments on three datasets (i.e., ASQA, StrategyQA, and ELI5) demonstrate
that APO achieves state-of-the-art citation F1 with higher answer quality.