A Unified Document-Level Chinese Discourse Parser on Different Granularity Levels.

ICDAR (1)(2023)

Abstract
Discourse parsing aims to comprehend the structure and semantics of a document. Some previous studies parse documents at multiple levels of granularity but disregard the connections between those levels. Additionally, almost all Chinese discourse parsing approaches concentrate on a single granularity level because annotated corpora are lacking. To address these issues, we propose a unified document-level Chinese discourse parser that works across multiple granularity levels and leverages the connections between paragraphs and Elementary Discourse Units (EDUs) in a document. Specifically, we first identify EDU-level discourse trees and then introduce a structural encoding module to capture EDU-level structural and semantic information, which significantly aids the construction of paragraph-level discourse trees. Moreover, we construct the Unified Chinese Discourse TreeBank (UCDTB), which includes 467 articles annotated from the clause level up to the whole article, filling the gap in unified corpus resources for Chinese discourse parsing. Experiments on both the Chinese UCDTB and the English RST-DT show that our model outperforms the state-of-the-art (SOTA) baselines.
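To make the two granularity levels concrete, the following is a minimal illustrative sketch, not the paper's model. It only shows how EDU-level trees can serve as the leaves of a paragraph-level tree. The names `Node`, `build_right_branching`, and `parse_document` are hypothetical, the right-branching builder is a trivial stand-in for a learned parser, and the structural encoding module is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A discourse tree node; it is a leaf when it has no children."""
    level: str                      # "edu" or "paragraph" (hypothetical labels)
    text: Optional[str] = None      # set only on EDU leaves
    children: List["Node"] = field(default_factory=list)

    def edus(self) -> List[str]:
        """Collect the EDU texts under this node, left to right."""
        if not self.children:
            return [self.text] if self.text is not None else []
        collected: List[str] = []
        for child in self.children:
            collected.extend(child.edus())
        return collected


def build_right_branching(units: List[Node], level: str) -> Node:
    """Stand-in tree builder (assumption): combine units into a
    right-branching binary tree. A real parser would predict the splits."""
    if not units:
        raise ValueError("cannot build a tree from zero units")
    tree = units[-1]
    for unit in reversed(units[:-1]):
        tree = Node(level=level, children=[unit, tree])
    return tree


def parse_document(paragraphs: List[List[str]]) -> Node:
    """Build EDU-level trees first, then a paragraph-level tree over them."""
    # Step 1: one EDU-level tree per paragraph.
    paragraph_trees = [
        build_right_branching([Node("edu", text=e) for e in edus], "edu")
        for edus in paragraphs
    ]
    # Step 2: the paragraph-level tree uses the EDU-level trees as its leaves.
    return build_right_branching(paragraph_trees, "paragraph")
```

For example, `parse_document([["a", "b"], ["c"]])` returns a single tree whose root joins the two paragraphs and whose EDU leaves keep their document order.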