Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition

Entropy (2023)

Abstract
Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. Because no connective is present, the relation is implicit, and recognizing it requires a much deeper understanding of the semantics of the text. It is therefore important to preserve semantic completeness before attempting to predict the discourse relation. However, word-level embedding, which is widely used in existing work, may lose semantic information by splitting phrases that should be treated as complete semantic units. In this article, we propose three methods for segmenting a sentence into complete semantic units: a corpus-based method that serves as the baseline, a method based on constituent parse trees, and a method based on dependency parse trees; the two tree-based methods provide a more flexible and automatic way to divide the sentence. The segmented sentence is then embedded at the level of semantic units, so that the resulting embeddings can be fed into IDRR networks and play the same role as word embeddings. We integrated our methods into a recent IDRR model and compared its performance with that of the original version, which uses word-level embeddings. The results show that an appropriate embedding level better conserves the semantic information in the sentence and helps to improve the performance of IDRR models.
Keywords
implicit discourse relation recognition, shallow discourse parsing, relation extraction, phrase extraction
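
To make the idea of embedding at the level of semantic units more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a small hypothetical phrase lexicon (a stand-in for a corpus-based segmentation step) and uses greedy longest-match merging. Mean pooling of word vectors to obtain a unit embedding is also an assumption made here for illustration; the abstract does not say how unit embeddings are computed. The names `segment_units`, `embed_units`, `phrases`, and `word_vecs` are all hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Unit = Tuple[str, ...]


def segment_units(tokens: Sequence[str], phrases: Set[Unit], max_len: int = 4) -> List[Unit]:
    """Greedily merge tokens into units, preferring the longest known phrase.

    `phrases` is a hypothetical lexicon of multiword units; tokens that are
    not part of any known phrase become single-word units.
    """
    units: List[Unit] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first and fall back to one token.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if n == 1 or span in phrases:
                units.append(span)
                i += n
                break
    return units


def embed_units(units: Sequence[Unit], word_vecs: Dict[str, List[float]], dim: int) -> List[List[float]]:
    """Return one vector per unit by averaging its word vectors.

    Mean pooling is an illustrative assumption; unknown words map to zeros.
    """
    embedded: List[List[float]] = []
    for unit in units:
        vecs = [word_vecs.get(word, [0.0] * dim) for word in unit]
        embedded.append([sum(column) / len(vecs) for column in zip(*vecs)])
    return embedded


# Toy example with a hypothetical lexicon and 2-dimensional vectors.
tokens = ["he", "gave", "up", "in", "spite", "of", "rain"]
phrases = {("gave", "up"), ("in", "spite", "of")}
units = segment_units(tokens, phrases)
word_vecs = {"gave": [1.0, 0.0], "up": [0.0, 1.0]}
unit_vecs = embed_units(units, word_vecs, dim=2)
```

In this toy example, "gave up" and "in spite of" are each kept as a single unit, so a downstream model would receive four unit vectors rather than seven word vectors.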