Improving Unsupervised Extractive Summarization by Jointly Modeling Facet and Redundancy

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2022)

Abstract
Unsupervised extractive summarization aims to extract salient sentences from documents without a labeled corpus. Existing methods are mostly graph-based and rank sentences by their centrality. These methods have two main problems: facet bias and redundancy. Facet bias leads summarization models to select sentences from the same facet, which often means other vital facets are ignored, especially in long-document and multi-document settings. First, to address the facet bias problem, we propose a novel Facet-Aware centrality-based Ranking model (FAR). We let the model pay more attention to different facets by introducing a sentence-document weight, which is added to the sentence centrality score. FAR can also alleviate redundancy to some extent. Then, to further reduce redundancy, we propose a novel Redundancy- and Facet-Aware Ranking model (RFAR), which jointly models facet and redundancy by incorporating a Determinantal Point Process (DPP) into the previously proposed FAR. We evaluate FAR and RFAR on a wide range of summarization tasks covering 8 representative benchmark datasets. Experimental results show that FAR and RFAR consistently outperform strong baselines, especially in long- and multi-document scenarios, and even perform comparably to some supervised models. In addition, we find that our methods can alleviate the position bias problem.
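The abstract does not state the exact FAR or RFAR formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes sentences are already embedded as row vectors, uses the mean sentence vector as a stand-in document representation, adds a cosine-based sentence-document weight to a degree-centrality score, and then applies a naive greedy DPP selection to discourage redundant picks. The names `facet_aware_scores`, `greedy_dpp_select`, and the mixing parameter `lam` are hypothetical.

```python
import numpy as np


def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    return Xn @ Xn.T


def facet_aware_scores(X, lam=1.0):
    """Degree centrality plus a sentence-document weight (assumed form)."""
    S = cosine_matrix(X)
    # Centrality: summed similarity to all other sentences (self excluded).
    centrality = S.sum(axis=1) - np.diag(S)
    # Assumption: the document is represented by the mean sentence vector.
    doc = X.mean(axis=0)
    denom = np.linalg.norm(X, axis=1) * np.linalg.norm(doc)
    doc_sim = (X @ doc) / np.clip(denom, 1e-12, None)
    return centrality + lam * doc_sim


def greedy_dpp_select(X, scores, k):
    """Naive greedy MAP selection under a quality-times-similarity DPP kernel."""
    n = len(scores)
    S = cosine_matrix(X)
    # Assumption: map scores to positive quality terms with an exponential.
    q = np.exp(scores - scores.max())
    # Kernel L = diag(q) S diag(q), with a small jitter for numerical stability.
    L = q[:, None] * S * q[None, :] + 1e-6 * np.eye(n)
    selected = []
    for _ in range(min(k, n)):
        best, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # A larger log-determinant means higher quality and more diversity.
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break
        selected.append(best)
    return selected
```

With two identical sentence vectors and one distinct vector, the determinant term penalizes choosing both duplicates, so a two-sentence selection would be expected to cover both directions. This greedy loop recomputes determinants from scratch and is meant for illustration only; faster incremental DPP inference schemes exist.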
Keywords
Redundancy, Mathematical models, Computational modeling, Speech processing, Task analysis, Visualization, Training data, Determinantal point process, facet-bias problem, redundancy, unsupervised extractive summarization