Chapter 1: Lexicalized PCFG: Parsing Czech

msra(2007)

Abstract
Recent work in statistical parsing of English has used lexicalized trees as a representation, and has exploited parameterizations that lead to probabilities directly associated with dependencies between pairs of words in the tree structure. Parsed corpora such as the Penn Treebank have generally been sets of sentence/tree pairs: typically, hand-coded rules are used to assign head-words to each constituent in the tree, and the dependency structures are then implicit in the tree. For Czech, we have dependency annotations but no tree structures. To parse Czech, we therefore considered a strategy of converting the dependency structures in the training data to lexicalized trees, then running the parsing algorithms originally developed for English. A few notes about this mapping between trees and dependencies:
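To make concrete the idea that dependencies are implicit in a head-annotated tree, here is a minimal sketch in Python. It is not the authors' implementation: the `Node` class, the `head_index` field, and the toy English sentence are all hypothetical, and the head choices are hard-coded rather than produced by a real head-rule table. Each non-head child's head-word is taken to depend on the head-word of its parent constituent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A constituent in a lexicalized tree (hypothetical structure)."""
    label: str
    word: Optional[str] = None                      # set only on leaves
    children: List["Node"] = field(default_factory=list)
    head_index: int = 0                             # child that supplies the head-word


def head_word(node: Node) -> str:
    """Follow head children down to a leaf to find the constituent's head-word."""
    if node.word is not None:
        return node.word
    return head_word(node.children[node.head_index])


def dependencies(node: Node) -> List[Tuple[str, str]]:
    """Return (modifier, head) pairs implied by the head annotations."""
    if node.word is not None:
        return []
    deps: List[Tuple[str, str]] = []
    head = head_word(node)
    for i, child in enumerate(node.children):
        if i != node.head_index:
            # A non-head child's head-word modifies the parent's head-word.
            deps.append((head_word(child), head))
        deps.extend(dependencies(child))
    return deps


# Toy example (assumed, not from the paper): S -> NP VP, VP -> V NP.
tree = Node("S", head_index=1, children=[
    Node("NP", children=[Node("N", word="John")]),
    Node("VP", head_index=0, children=[
        Node("V", word="saw"),
        Node("NP", children=[Node("N", word="Mary")]),
    ]),
])

deps = dependencies(tree)
```

Going the other way, from dependency annotations to a lexicalized tree, requires choosing constituent labels and bracketing that the dependencies alone do not determine, which is why the conversion calls for explicit design decisions.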