Topic-driven Distant Supervision Framework for Macro-level Discourse Parsing

CoRR(2023)

Abstract
Discourse parsing, the task of analyzing the internal rhetorical structure of texts, is a challenging problem in natural language processing. Despite the recent advances in neural models, the lack of large-scale, high-quality corpora for training remains a major obstacle. Recent studies have attempted to overcome this limitation by using distant supervision, which utilizes results from other NLP tasks (e.g., sentiment polarity, attention matrix, and segmentation probability) to parse discourse trees. However, these methods do not take into account the differences between in-domain and out-of-domain tasks, resulting in lower performance and an inability to leverage high-quality in-domain data for further improvement. To address these issues, we propose a distant supervision framework that leverages the relations between topic structure and rhetorical structure. Specifically, we propose two distantly supervised methods, based on transfer learning and the teacher-student model, that narrow the gap between in-domain and out-of-domain tasks through label mapping and oracle annotation. Experimental results on the MCDTB and RST-DT datasets show that our methods achieve the best performance in both distantly supervised and supervised scenarios.
Keywords
distant supervision framework, discourse