SKT5SciSumm – A Hybrid Generative Approach for Multi-Document Scientific Summarization
CoRR (2024)
Abstract
Summarization of scientific text has shown significant benefits for both the
research community and society at large. Because scientific text is
distinctive in nature and the input to the multi-document summarization task
is substantially long, the task requires adequate embedding generation and
text truncation that does not lose important information.
To tackle these issues, we propose SKT5SciSumm, a hybrid
framework for multi-document scientific summarization (MDSS). We leverage the
Sentence-Transformer version of Scientific Paper Embeddings using
Citation-Informed Transformers (SPECTER) to encode and represent textual
sentences, allowing for efficient extractive summarization using k-means
clustering. We then employ the T5 family of models to generate abstractive
summaries from the extracted sentences. SKT5SciSumm achieves state-of-the-art
performance on the Multi-XScience dataset. Through extensive experiments and
evaluation, we show that the framework reaches these results with
comparatively simple models, highlighting its potential to advance
multi-document summarization of scientific text.
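
The extractive stage described in the abstract (embed sentences, cluster the embeddings with k-means, keep representative sentences) can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings here are hand-made 2-D toy vectors standing in for SPECTER sentence embeddings, the farthest-point initialisation and the "closest sentence to each centroid" selection rule are assumptions made for the example, and the function names `kmeans` and `select_representatives` are hypothetical. The abstractive T5 step is deliberately left out.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(points: List[Vector], k: int, iters: int = 50) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means (Lloyd's algorithm) with a deterministic initialisation."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: farthest-point initialisation, chosen only for reproducibility.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(_dist2(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: _dist2(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(_mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def select_representatives(sentences: List[str], embeddings: List[Vector], k: int) -> List[str]:
    """Keep the sentence nearest to each cluster centroid, in document order."""
    centroids, _ = kmeans(embeddings, k)
    chosen = {
        min(range(len(embeddings)), key=lambda i: _dist2(embeddings[i], c))
        for c in centroids
    }
    return [sentences[i] for i in sorted(chosen)]


# Toy input: two well-separated groups of "sentence embeddings".
sentences = [
    "Transformers dominate neural summarization.",
    "Attention layers scale quadratically with input length.",
    "Citation graphs link related papers.",
    "Long inputs must be truncated before decoding.",
    "Related-work sections cite many sources.",
    "Citation context helps describe a paper.",
]
embeddings = [(0, 0), (0, 1), (10, 10), (1, 0), (10, 11), (11, 10)]

extracted = select_representatives(sentences, embeddings, k=2)
print(extracted)
```

In a full pipeline along the lines the abstract describes, the toy vectors would be replaced by embeddings from a SPECTER-based sentence encoder, and the extracted sentences would be concatenated and passed to a T5 model to produce the abstractive summary.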