Evaluating a statistical CCG parser on Wikipedia

PWNLP@IJCNLP(2009)

Abstract
The vast majority of parser evaluation is conducted on the 1984 Wall Street Journal (WSJ). In-domain evaluation of this kind is important for system development, but gives little indication about how the parser will perform on many practical problems. Wikipedia is an interesting domain for parsing that has so far been under-explored. We present statistical parsing results that for the first time provide information about what sort of performance a user parsing Wikipedia text can expect. We find that the C&C parser's standard model is 4.3% less accurate on Wikipedia text, but that a simple self-training exercise reduces the gap to 3.8%. The self-training also speeds up the parser on newswire text by 20%.
Keywords
Wall Street Journal, simple self-training exercise, in-domain evaluation, newswire text, statistical CCG parser, practical problem, Wikipedia text, interesting domain, parser evaluation, statistical parsing result, C&C parser, standard model