Simple Semi-supervised Dependency Parsing

ACL (2008)

Cited by 568 | Viewed 201
Abstract
We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
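As a rough illustration of the idea described in the abstract, the sketch below shows how hierarchical word-cluster bit strings (e.g., from Brown clustering of an unannotated corpus) could be turned into string-valued features for a head-modifier arc. The cluster dictionary, prefix lengths, and feature names are assumptions for illustration only, not the paper's exact feature templates.

```python
# Minimal sketch: cluster-prefix features for a dependency arc.
# BROWN_CLUSTERS, PREFIX_LENGTHS, and the feature name format are
# hypothetical; the paper's actual templates differ.
from typing import Dict, List

# Hypothetical mapping from word -> cluster bit string, as produced by
# clustering a large unannotated corpus.
BROWN_CLUSTERS: Dict[str, str] = {
    "bank": "0010110",
    "lend": "1101001",
}

PREFIX_LENGTHS = (4, 6)  # assumed short and long prefix lengths


def cluster_features(head: str, modifier: str) -> List[str]:
    """Pair cluster prefixes of the head and modifier words,
    falling back to an UNK token for out-of-vocabulary words."""
    h = BROWN_CLUSTERS.get(head, "UNK")
    m = BROWN_CLUSTERS.get(modifier, "UNK")
    feats = [f"hc{k}={h[:k]}|mc{k}={m[:k]}" for k in PREFIX_LENGTHS]
    feats.append(f"hc={h}|mc={m}")  # full cluster strings
    return feats


if __name__ == "__main__":
    print(cluster_features("lend", "bank"))
```

Features of this kind can simply be appended to a baseline parser's lexical feature set, which is what lets coarse cluster prefixes generalize across rare or unseen words.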
Keywords
dependency parsing, natural language processing, second order