Context-Enhanced Language Models for Generating Multi-Paper Citations
International Conference on Big Data Analytics (2024)
Abstract
Citation text plays a pivotal role in elucidating the connection between
scientific documents, demanding an in-depth comprehension of the cited paper.
Constructing citations is often time-consuming, requiring researchers to delve
into extensive literature and grapple with articulating relevant content. To
address this challenge, the field of citation text generation (CTG) has
emerged. However, while earlier methods have primarily centered on creating
single-sentence citations, practical scenarios frequently necessitate citing
multiple papers within a single paragraph. To bridge this gap, we propose a
method that leverages Large Language Models (LLMs) to generate multi-citation
sentences. Our approach involves a single source paper and a collection of
target papers, culminating in a coherent paragraph containing multi-sentence
citation text. Furthermore, we introduce a curated dataset named MCG-S2ORC,
composed of English-language academic research papers in Computer Science,
showcasing multiple citation instances. In our experiments, we evaluate three
LLMs (LLaMA, Alpaca, and Vicuna) to ascertain the most effective model for this
endeavor. Additionally, we exhibit enhanced performance by integrating
knowledge graphs from target papers into the prompts for generating citation
text. This research underscores the potential of harnessing LLMs for citation
generation, opening a compelling avenue for exploring the intricate connections
between scientific documents.
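
As a rough illustration of the prompting idea described above, the following minimal sketch assembles a prompt from one source abstract, several target abstracts, and knowledge-graph triples extracted from each target paper. The abstract does not give the actual prompt template or the knowledge-graph format, so the function names (`build_prompt`, `format_triples`), the `(head; relation; tail)` rendering, and the instruction wording are all assumptions made for illustration.

```python
from typing import List, Tuple

# A knowledge-graph edge as (head, relation, tail). This representation is an
# assumption; the abstract does not specify how the graphs are encoded.
Triple = Tuple[str, str, str]


def format_triples(triples: List[Triple]) -> str:
    """Render triples one per line as '(head; relation; tail)'."""
    return "\n".join(f"({h}; {r}; {t})" for h, r, t in triples)


def build_prompt(
    source_abstract: str,
    target_abstracts: List[str],
    target_triples: List[List[Triple]],
) -> str:
    """Build a hypothetical multi-citation prompt for an instruction-tuned LLM."""
    if len(target_abstracts) != len(target_triples):
        raise ValueError("need one triple list per target paper")

    parts = [
        "Write a coherent paragraph for the source paper that cites every "
        "target paper using its bracketed number.",
        "",
        "Source paper abstract:",
        source_abstract.strip(),
    ]
    for i, (abstract, triples) in enumerate(
        zip(target_abstracts, target_triples), start=1
    ):
        parts += ["", f"Target paper [{i}] abstract:", abstract.strip()]
        if triples:  # omit the relations section when no triples are available
            parts += [f"Target paper [{i}] relations:", format_triples(triples)]
    parts += ["", "Citation paragraph:"]
    return "\n".join(parts)
```

The returned string would then be passed to whichever model is under evaluation; leaving out the triples yields a plain abstract-only prompt that can serve as a comparison point.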