PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation

Ishan Jindal, Alexandre Rademaker,Khoi-Nguyen Tran, Huaiyu Zhu,Hiroshi Kanayama, Marina Danilevsky,Yunyao Li

17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023(2023)

引用 0|浏览13
暂无评分
摘要
Semantic role labeling (SRL) identifies predicate-argument structures in a sentence. This task is usually accomplished in four steps: predicate identification, predicate sense disambiguation, argument identification, and argument classification. Errors introduced at one step propagate to later steps. Unfortunately, the existing SRL evaluation scripts do not consider the full effect of this error propagation aspect. They either evaluate arguments independent of predicate sense (CoNLL09) or do not evaluate predicate sense at all (CoNLL05), yielding an inaccurate SRL model performance on the argument classification task. In this paper, we address key practical issues with existing evaluation scripts and propose a more strict SRL evaluation metric, PriMeSRL. We observe that by employing PriMeSRL, the quality evaluation of all SoTA SRL models drops significantly, and their relative rankings also change. We also show that PriMeSRL successfully penalizes actual failures in SoTA SRL models.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要