Enhancing unlexicalized parsing performance using a wide coverage lexicon, fuzzy tag-set mapping, and EM-HMM-based lexical probabilities

EACL(2009)

Abstract
We present a framework for interfacing a PCFG parser with lexical information from an external resource that follows a different tagging scheme than the treebank. This is achieved by defining a stochastic mapping layer between the two resources. Lexical probabilities for rare events are estimated in a semi-supervised manner from a lexicon and large unannotated corpora. We show that this solution greatly enhances the performance of an unlexicalized Hebrew PCFG parser, resulting in state-of-the-art Hebrew parsing results both when a segmentation oracle is assumed and in a real-world scenario of parsing unsegmented tokens.
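One way to picture the stochastic mapping layer is as a marginalization over lexicon tags: the probability of a word given a treebank tag is the sum, over lexicon tags, of the mapping probability times the lexicon-side emission probability. The sketch below is only an illustration under that assumption and is not taken from the paper. The function name `map_emissions`, the tag names, the words `w1` to `w3`, and all numbers are hypothetical toy values.

```python
from collections import defaultdict


def map_emissions(p_lex_given_tb, p_word_given_lex):
    """Combine a tag-set mapping with lexicon-side emissions (illustrative only).

    Assumed form: p(word | tb_tag) = sum over lex_tag of
                  p(lex_tag | tb_tag) * p(word | lex_tag)
    """
    p_word_given_tb = {}
    for tb_tag, lex_dist in p_lex_given_tb.items():
        word_probs = defaultdict(float)
        for lex_tag, p_map in lex_dist.items():
            # Lexicon tags with no known emissions contribute nothing.
            for word, p_emit in p_word_given_lex.get(lex_tag, {}).items():
                word_probs[word] += p_map * p_emit
        p_word_given_tb[tb_tag] = dict(word_probs)
    return p_word_given_tb


# Hypothetical toy distributions: one treebank tag maps fuzzily to two lexicon tags.
P_LEX_GIVEN_TB = {
    "NN": {"noun": 0.75, "participle": 0.25},
    "VB": {"verb": 1.0},
}
P_WORD_GIVEN_LEX = {
    "noun": {"w1": 0.5, "w2": 0.5},
    "participle": {"w2": 1.0},
    "verb": {"w3": 1.0},
}

EMISSIONS = map_emissions(P_LEX_GIVEN_TB, P_WORD_GIVEN_LEX)
```

In this toy setup the treebank tag `NN` spreads its mass over two lexicon tags, so `w2` receives probability from both. In the setting the abstract describes, the lexicon-side emission probabilities would come from the semi-supervised estimation over a lexicon and unannotated text rather than from hand-set values.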
Keywords
pcfg parser, real-world parsing scenario, lexical probability, unlexicalized parsing performance, fuzzy tag-set mapping, lexical information, rare event, state-of-the-art hebrew parsing result, different tagging scheme, em-hmm-based lexical probability, external resource, large unannotated corpus, wide coverage lexicon, unlexicalized hebrew pcfg parser