Vietnamese Text Summarization Based on Elementary Discourse Units.

Khang Nhut Lam, Tai Ngoc Nguyen, Jugal Kalita

NLPIR (2022)

Abstract
This paper presents text summarization models based on elementary discourse units (EDUs) for producing extractive and abstractive summaries of Vietnamese documents. First, we introduce algorithms that use part-of-speech (POS) information to construct EDUs in Vietnamese. The resulting EDUs are then fed into an extractive summarization model that uses a pointer network and an abstractive summarization model that uses a pointer-generator model. A reinforcement learning method is used to improve the quality of the models. We perform experiments on the CTUNLPSUM dataset, which comprises 1,053,702 Vietnamese documents extracted from online magazines. The extractive summarization models based on EDUs outperform other extractive summarization models based on words or sentences. The best extractive model achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.567, 0.241, and 0.461, and the best abstractive model achieves 0.530, 0.213, and 0.394, respectively.
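
For readers unfamiliar with the ROUGE-N scores reported above, the following is a minimal sketch of how an n-gram overlap score of this kind can be computed. It is not the authors' evaluation code. The function names `ngrams` and `rouge_n` are hypothetical, whitespace tokenization is an assumption (Vietnamese word segmentation would normally need a dedicated tokenizer), and the abstract does not state whether the reported numbers are recall or F1, so F1 is used here purely for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Illustrative ROUGE-N F1 between two strings.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_n(summary, reference, n=2)` gives the bigram variant. ROUGE-L, which is based on the longest common subsequence, is not covered by this sketch.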