Textual entailment as an evaluation metric for abstractive text summarization

Swagat Shubham Bhuyan, Saranga Kingkor Mahanta, Partha Pakray, Benoit Favre

Natural Language Processing Journal (2023)

Abstract
Automated text summarization systems need to be mindful of the reader and of the communication goals, since the summary may be the deciding factor in whether the original text is worth reading in full. A summary can also help improve document indexing for information retrieval, and it is generally less biased than a human-written one. A crucial part of building intelligent systems is evaluating them; consequently, the choice of evaluation metric(s) is of utmost importance. Current standard evaluation metrics such as BLEU and ROUGE, although fairly effective for evaluating extractive text summarization systems, become futile when it comes to comparing semantic information between two texts, i.e., in abstractive summarization. We propose textual entailment as a metric for evaluating abstractive summaries. The results show that textual entailment serves as a strong automated evaluation model for such summaries. Entailment scores were computed between the source text and the generated summaries, and between the reference and predicted summaries, and an overall summarizer score was derived to give a fair indication of how effective the generated summaries are. We put forward novel methods that use the entailment scores and the final summarizer scores for a reasonable evaluation across various scenarios. A Final Entailment Metric Score (FEMS) was generated to compare the generated summaries in an insightful way.
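The abstract describes the scoring pipeline only at a high level. The following is a minimal sketch of the idea, assuming an off-the-shelf NLI model (roberta-large-mnli) as the entailment scorer; the two entailment directions follow the abstract, but the simple averaging in summarizer_score is a hypothetical stand-in for the paper's FEMS combination, which is not specified here.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumed scorer; the paper's exact model is not stated here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def entailment_score(premise: str, hypothesis: str) -> float:
    # Encode the premise-hypothesis pair and return P(entailment) under the NLI model.
    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    return probs[2].item()  # label order for this model: contradiction, neutral, entailment

def summarizer_score(source: str, reference: str, generated: str) -> float:
    # Combine the two entailment directions named in the abstract:
    # source -> generated summary, and reference -> generated summary.
    # The plain average is an assumed weighting, not the paper's FEMS formula.
    return 0.5 * (entailment_score(source, generated)
                  + entailment_score(reference, generated))

Comparing summarizer_score for two candidate summaries of the same source then gives the kind of side-by-side comparison the FEMS is described as enabling.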
Keywords
textual entailment, evaluation metric