Generalizing the semantic roles in the Chinese Proposition Bank

Language Resources and Evaluation (2016)

Abstract
The Chinese Proposition Bank (CPB) is a corpus annotated with semantic roles for the arguments of verbal and nominalized predicates. The semantic roles for the core arguments are defined in a predicate-specific manner. That is, a set of numerically identified semantic roles is defined for each sense of a predicate lemma and recorded in a valency lexicon called frame files. Defining the roles per predicate reduces the cognitive burden on the annotators, since they need to internalize only a few roles at a time, and this has contributed to the consistency of the annotation. It was also a sensible approach given the contentious issue of how many semantic roles are needed if one were to adopt a set of global semantic roles that apply to all predicates. A downside of this approach, however, is that the predicate-specific roles may not be consistent across predicates, and this inconsistency has a negative impact on training automatic systems. Given the progress that has been made in defining semantic roles in the last decade or so, the time is ripe for adopting a set of general semantic roles. In this article, we describe our effort to “re-annotate” the CPB with a set of “global” semantic roles that are predicate-independent, and we investigate their impact on automatic semantic role labeling systems. When defining these global semantic roles, we strive to make them compatible with a recently published ISO standard on the annotation of semantic roles (ISO 24617-4:2014 SemAF-SR) while taking the linguistic characteristics of the Chinese language into account. We show that, in spite of the much larger number of global semantic roles, the accuracy of an off-the-shelf semantic role labeling system retrained on the data re-annotated with global semantic roles is comparable to that of the same system trained on the data set with the original predicate-specific semantic roles. We also argue that the re-annotated data set, together with the original data, provides the user with more flexibility when using the corpus.
Keywords
Semantic role, Predicate-argument structure, Chinese Proposition Bank, Semantic role labeling