Enhanced sentence representation for extractive text summarization: Investigating the syntactic and semantic features and their contribution to sentence scoring.

Expert Syst. Appl. (2023)

Abstract
The main challenge in extractive text summarization is sentence scoring, and the critical factor in scoring is how sentences are represented. This study investigates this hypothesis through a detailed analysis of the impact of syntactic and semantic sentence representation techniques. It first evaluates the empirical impact of individual syntactic and semantic features on summarization accuracy. To examine syntactic features, a comprehensive list of 40 syntactic features was developed, while semantic representation was obtained using sentence embeddings. An improved feature set that jointly uses syntactic and semantic features was then proposed. To assess its impact on the resulting summaries, the proposed sentence representation was tested on three distinct summarization corpora consisting of lengthy scientific documents from diverse domains. The quality of the resulting summaries was assessed using both summary evaluation metrics and classification performance metrics. The experiments indicated that summaries generated with the proposed feature set outperformed not only those obtained using individual features but also summaries produced by state-of-the-art methods.
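To make the idea of a joint representation concrete, the following is a minimal sketch of concatenating syntactic features with a semantic vector for each sentence. It is not the authors' implementation. The abstract does not list the 40 syntactic features or name the embedding model, so the four features below and the hashed bag-of-words `toy_embedding` are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import hashlib
import math
import re


def syntactic_features(sentence, position, total):
    """Four illustrative surface features (hypothetical, not the paper's 40)."""
    tokens = re.findall(r"\w+", sentence)
    n = len(tokens)
    return [
        float(n),                                         # sentence length in tokens
        position / max(total - 1, 1),                     # relative position in the document
        sum(t.isdigit() for t in tokens) / max(n, 1),     # share of numeric tokens
        sum(t[0].isupper() for t in tokens) / max(n, 1),  # share of capitalized tokens
    ]


def toy_embedding(sentence, dim=8):
    """Stand-in for a real sentence embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for tok in re.findall(r"\w+", sentence.lower()):
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def represent(sentences):
    """Concatenate syntactic and semantic vectors for every sentence."""
    total = len(sentences)
    return [
        syntactic_features(s, i, total) + toy_embedding(s)
        for i, s in enumerate(sentences)
    ]
```

Each sentence is mapped to a single vector whose first four entries are syntactic and whose remaining entries are semantic. In a full pipeline, such vectors would be passed to a sentence scorer or classifier, and a trained sentence embedding model would replace `toy_embedding`.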
Keywords
extractive text summarization, sentence representation, semantic features