Facilitating the analysis of discourse phenomena in an interoperable NLP platform
CICLing (1)(2013)
摘要
The analysis of discourse phenomena is essential in many natural language processing (NLP) applications. The growing diversity of available corpora and NLP tools brings a multitude of representation formats. In order to alleviate the problem of incompatible formats when constructing complex text mining pipelines, the Unstructured Information Management Architecture (UIMA) provides a standard means of communication between tools and resources. U-Compare, a text mining workflow construction platform based on UIMA, further enhances interoperability through a shared system of data types, allowing free combination of compliant components into workflows. Although U-Compare and its type system already support syntactic and semantic analyses, support for the analysis of discourse phenomena was previously lacking. In response, we have extended the U-Compare type system with new discourse-level types. We illustrate processing and visualisation of discourse information in U-Compare by providing several new deserialisation components for corpora containing discourse annotations. The new U-Compare is downloadable from http://nactem.ac.uk/ucompare.
更多查看译文
关键词
discourse information,u-compare type system,new deserialisation component,new u-compare,nlp tool,new discourse-level type,discourse annotation,shared system,discourse phenomenon,interoperable nlp platform,type system,coreference,causality
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要