Unsupervised abstractive summarization via sentence rewriting

Computer Speech & Language (2022)

Abstract
Unsupervised extractive summarization aims to extract salient sentences from a document without a labeled corpus. Existing methods have made promising progress thanks to large-scale pre-trained language models and high-quality contextualized representations. However, because extractive summaries are spliced together from separate sentences, they often fail to maintain smooth transitions between sentences and struggle to form coherent, fluent text. Yet, to the best of our knowledge, very few studies currently focus on unsupervised abstractive summarization. Inspired by the intuitive human process of writing summaries, which involves first extracting salient sentences and then reconstructing them, we propose an Extract-then-Abstract framework to generate more coherent and human-like summaries. Specifically, in the extraction stage, we adopt an extractive summarization model as the summarizer to generate an extractive summary. Then, in the abstraction stage, we propose a BART-based sentence rewrite model to generate a more coherent and fluent abstractive summary. To this end, we design a novel parallel data creation method for our rewrite model, based on an effective sentence sampling strategy that requires no manual annotation. Extensive experiments, including automatic and human evaluation, demonstrate that our framework consistently outperforms strong baselines for unsupervised abstractive summarization and generates more coherent and human-like summaries while maintaining competitive ROUGE scores relative to unsupervised extractive summarization.
Keywords
Unsupervised abstractive summarization, Sentence rewrite model, Pre-trained language model, Coherence text
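
The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-frequency scorer below is only a stand-in for the extractive summarization model, and the `rewrite` argument is a hypothetical placeholder for the BART-based sentence rewrite model, which is not reproduced here. All function names are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(document: str, k: int = 2) -> List[str]:
    """Extraction stage (stand-in): score each sentence by the average
    document-level frequency of its words and keep the top k sentences,
    returned in their original document order."""
    sentences = split_sentences(document)
    freq = Counter(re.findall(r"\w+", document.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def join_sentences(sentences: List[str]) -> str:
    """Default rewriter (assumption): plain concatenation, i.e. no rewriting."""
    return " ".join(sentences)


def summarize(document: str,
              rewrite: Callable[[List[str]], str] = join_sentences,
              k: int = 2) -> str:
    """Extract-then-Abstract pipeline: extract k sentences, then pass them
    to a pluggable rewriter that produces the final summary text."""
    return rewrite(extract(document, k))
```

With the default rewriter the output is a purely extractive summary; swapping in a trained sequence-to-sequence model for `rewrite` is where the abstraction stage would take place.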