Multidocument Summarization of Engineering Papers Based on Macro- and Microstructure

Journal of Computing and Information Science in Engineering (2011)

Abstract
This paper focuses on automatic summarization of multiple engineering papers and proposes a summarization approach based on the documents' macro- and microstructure. The macrostructure consists of a list of ranked topics from engineering papers. Topics are discovered by extracting frequently appearing word sequences and grouping them into equivalence classes. The macrostructure thus symbolically represents the topical links across different papers. The microstructure, in turn, is defined as the rhetorical structure within a single paper. Identifying the microstructure is approached as a classification problem: each sentence in a paper is automatically labeled with one of the predefined rhetorical categories. Unlike existing summarization methods that first separate documents into nonoverlapping clusters and then summarize each cluster individually, our approach summarizes multiple documents according to the characteristics suggested at the macro- and microstructure levels. The experimental study showed that the proposed approach outperformed peer systems in terms of Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores and readers' responsiveness. In an independent manual categorization task using the summaries generated by our approach and by peer systems, our approach also achieved better precision and recall. [DOI: 10.1115/1.3563048]
Keywords
multidocument summarization, macrostructure, microstructure, document structure analysis, summarization evaluation
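
The abstract reports results as ROUGE scores. As a rough illustration of what such a score measures, the following is a minimal sketch of ROUGE-N recall against a single reference summary. It assumes lowercased whitespace tokenization and omits stemming and stopword handling; the function names are illustrative and are not taken from the paper or from any ROUGE toolkit.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) found in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram recall of a candidate summary against one reference.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipped counting).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat sat"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

Because the score is recall-oriented, it rewards a summary for covering the n-grams of the reference and does not penalize extra words in the candidate.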