# Non-projective dependency parsing using spanning tree algorithms

HLT/EMNLP, pp. 523–530 (2005)

Abstract

We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n³) time. More surprisingly, the representation extends naturally to non-projective parsing using the Chu-Liu-Edmonds maximum spanning tree algorithm.

Introduction

- Dependency parsing has seen a surge of interest lately for applications such as relation extraction (Culotta and Sorensen, 2004), machine translation (Ding and Palmer, 2005), synonym generation (Shinyama et al., 2002), and lexical resource augmentation (Snow et al., 2004).
- Figure 1 shows a dependency tree for the sentence *John hit the ball with the bat*.
- The tree in Figure 1 is projective: if the words are put in their linear order, preceded by the root, the edges can be drawn above the words without crossings; equivalently, a word and its descendants form a contiguous substring of the sentence.
- In languages with more flexible word order than English, such as German, Dutch and Czech, non-projective dependencies are more frequent.
- Rich inflection systems reduce reliance on word order to express grammatical relations, making non-projective dependencies more common in such languages.
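The contiguity condition above can be checked directly: a tree is projective iff every word lying strictly between a head and its dependent is a descendant of that head. The sketch below assumes a hypothetical head-array encoding (heads[d] is the head of word d for d = 1..n, with 0 for the artificial root); it is illustrative, not code from the paper.

```python
def is_projective(heads):
    """Check whether a dependency tree is projective.

    Hypothetical encoding: heads[d] is the head of word d for d = 1..n,
    heads[0] is an unused placeholder, and 0 marks the artificial root.
    Assumes `heads` encodes a valid tree.
    """
    def dominated_by(k, h):
        # Walk from k up the tree; True if we pass through h (0 = root).
        while k != 0:
            if k == h:
                return True
            k = heads[k]
        return h == 0

    for d in range(1, len(heads)):
        h = heads[d]
        # Every word strictly between a head and its dependent must be
        # a descendant of that head, or an edge would cross.
        for k in range(min(h, d) + 1, max(h, d)):
            if not dominated_by(k, h):
                return False
    return True

# "John hit the ball with the bat" (Figure 1) is projective:
print(is_projective([0, 2, 0, 4, 2, 2, 7, 5]))  # True
# Crossing arcs 1->3 and 4->2 make this 4-word tree non-projective:
print(is_projective([0, 0, 4, 1, 1]))  # False
```

The quadratic ancestor walk is deliberately simple; the point is only that projectivity is a property of word order, not of the tree shape alone.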

Highlights

- We have shown that natural language dependency parsing can be reduced to finding maximum spanning trees in directed graphs
- This reduction results from edge-based factorization and can be applied to projective languages with the Eisner parsing algorithm and non-projective languages with the Chu-Liu-Edmonds maximum spanning tree algorithm
- By viewing dependency structures as spanning trees, we have provided a general framework for parsing trees for both projective and nonprojective languages
- In particular, the non-projective parsing algorithm based on the Chu-Liu-Edmonds maximum spanning tree (MST) algorithm provides true non-projective parsing
- Less than 2% of total edges are non-projective
- Another major improvement here is that the Chu-Liu-Edmonds non-projective MST algorithm has a parsing complexity of O(n²), versus the O(n³) complexity of the projective Eisner algorithm, which in practice leads to improvements in parsing time
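Under edge-based factorization, a tree's score is the sum of its arc scores, so the best non-projective parse is the highest-scoring spanning arborescence of the complete arc-scored digraph. The following is a compact recursive sketch of the Chu-Liu-Edmonds procedure (greedy arc selection, cycle contraction, expansion) with hypothetical arc scores; it is not the authors' implementation, and this naive form can take O(n³) time, whereas Tarjan (1977) gives an O(n²) version for dense graphs.

```python
from collections import defaultdict

def find_cycle(head):
    """Return one cycle (as a list of nodes) in the head map, or None."""
    for start in head:
        seen, v = {start}, start
        while v in head:
            v = head[v]
            if v == start:            # walked back to where we began
                cycle, u = [start], head[start]
                while u != start:
                    cycle.append(u)
                    u = head[u]
                return cycle
            if v in seen:
                break                 # a cycle exists but excludes start
            seen.add(v)
    return None

def chu_liu_edmonds(score, root=0):
    """Highest-scoring arborescence rooted at `root`.

    `score[h][d]` is the weight of arc h -> d; node ids are ints, and
    every non-root node is assumed to have at least one incoming arc.
    Returns a {dependent: head} map covering every non-root node.
    """
    nodes = set(score) | {d for h in score for d in score[h]}
    nodes.discard(root)

    # 1. Every non-root node greedily picks its best incoming arc.
    best = {d: max((h for h in score if d in score[h] and h != d),
                   key=lambda h: score[h][d])
            for d in nodes}
    cycle = find_cycle(best)
    if cycle is None:                 # greedy choice is already a tree
        return best

    # 2. Contract the cycle into a fresh super-node `c`, rescoring arcs
    #    that enter it relative to the cycle arc they would break.
    c = max(nodes) + 1
    cyc = set(cycle)
    new_score = defaultdict(dict)
    enter, leave = {}, {}
    for h in score:
        for d, s in score[h].items():
            if h in cyc and d in cyc:
                continue
            if h in cyc:              # arc leaving the cycle
                if s > new_score[c].get(d, float("-inf")):
                    new_score[c][d] = s
                    leave[d] = h
            elif d in cyc:            # arc entering the cycle
                adj = s - score[best[d]][d]
                if adj > new_score[h].get(c, float("-inf")):
                    new_score[h][c] = adj
                    enter[h] = d
            else:
                new_score[h][d] = s

    # 3. Solve the contracted problem, then expand the super-node.
    sub = chu_liu_edmonds(new_score, root)
    heads = {d: h for d, h in sub.items() if d != c and h != c}
    for v in cycle:
        heads[v] = best[v]            # keep cycle arcs ...
    heads[enter[sub[c]]] = sub[c]     # ... except the one we break
    for d, h in sub.items():
        if h == c and d != c:
            heads[d] = leave[d]       # restore real heads of c's children
    return heads

# Hypothetical arc scores over a three-word sentence (0 is the root).
# The greedy step picks the cycle 1 <-> 2, which must be contracted.
best = chu_liu_edmonds({0: {1: 9, 2: 10, 3: 9},
                        1: {2: 20, 3: 3},
                        2: {1: 30, 3: 30},
                        3: {1: 11, 2: 0}})
print(best)  # {1: 2, 2: 0, 3: 2}: word 2 heads words 1 and 3
```

Note that the adjusted entering-arc score charges each candidate for the cycle arc it would displace; since every arborescence of the contracted graph uses exactly one arc into the super-node, the cycle's own weight is a constant and can be dropped.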

Methods

- The authors performed experiments on the Czech Prague Dependency Treebank (PDT) (Hajic, 1998; Hajic et al., 2001).
- The authors used the predefined training, development and testing split of this data set.
- Czech POS tags are very complex, consisting of a series of slots that may or may not be filled with some value.
- These slots represent lexical and grammatical properties such as standard POS, case, gender, and tense.
- The number of features extracted from the PDT training set was 13,450,672, using the feature set outlined by McDonald et al. (2005)
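The slot structure of Czech POS tags can be made concrete with a small decoder. The sketch below assumes the commonly described 15-position PDT tag convention; the slot names are labeled here only for illustration, and the PDT documentation is the authoritative source for the slot inventory.

```python
# Illustrative slot names for a 15-position PDT-style tag; consult the
# PDT documentation for the authoritative inventory and ordering.
SLOTS = ["pos", "subpos", "gender", "number", "case", "possgender",
         "possnumber", "person", "tense", "grade", "negation",
         "voice", "reserve1", "reserve2", "variant"]

def decode_tag(tag):
    """Map a fixed-width positional tag to {slot: value}, skipping
    unfilled slots (marked '-')."""
    assert len(tag) == len(SLOTS), "expected a 15-character positional tag"
    return {name: ch for name, ch in zip(SLOTS, tag) if ch != "-"}

# A feminine singular noun in the nominative case, affirmative polarity:
print(decode_tag("NNFS1-----A----"))
```

Because each filled slot can combine with lexical context, such tags expand into many features, which is consistent with the millions of features reported above.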

Results

- When the authors focus on the subset of data that only contains sentences with at least one non-projective dependency, the effect is amplified.
- The authors note that the results in Collins et al. (1999) differ from those reported here due to different training and testing data sets

Conclusion

- The authors presented a general framework for parsing dependency trees based on an equivalence to maximum spanning trees in directed graphs.
- Under this framework, the authors show that, contrary to expectation, non-projective parsing has a lower asymptotic complexity than projective parsing
- Using this framework, the authors presented results showing that the non-projective model outperforms the projective model on the Prague Dependency Treebank, which contains a small number of non-projective edges.
- In the preceding discussion, the authors have shown that natural language dependency parsing can be reduced to finding maximum spanning trees in directed graphs.
- Non-projective parsing complexity is just O(n²), against the O(n³) complexity of the Eisner dynamic programming algorithm, which by construction enforces the non-crossing constraint

- Table 1: Dependency parsing results for Czech. Czech-B is the subset of Czech-A containing only sentences with at least one non-projective dependency
- Table 2: Dependency parsing results for English using spanning tree algorithms

Funding

- This work has been supported by NSF ITR grants 0205448 and 0428193

Reference

- Y.J. Chu and T.H. Liu. 1965. On the shortest arborescence of a directed graph. Science Sinica, 14:1396–1400.
- M. Collins, J. Hajic, L. Ramshaw, and C. Tillmann. 1999. A statistical parser for Czech. In Proc. ACL.
- M. Collins. 2002. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proc. EMNLP.
- T.H. Cormen, C.E. Leiserson, and R.L. Rivest. 1990. Introduction to Algorithms. MIT Press/McGraw-Hill.
- K. Crammer and Y. Singer. 2003. Ultraconservative online algorithms for multiclass problems. JMLR.
- K. Crammer, O. Dekel, S. Shalev-Shwartz, and Y. Singer. 2003. Online passive aggressive algorithms. In Proc. NIPS.
- K. Crammer, R. McDonald, and F. Pereira. 2004. New large margin algorithms for structured prediction. In Learning with Structured Outputs Workshop (NIPS).
- A. Culotta and J. Sorensen. 2004. Dependency tree kernels for relation extraction. In Proc. ACL.
- Y. Ding and M. Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In Proc. ACL.
- J. Edmonds. 1967. Optimum branchings. Journal of Research of the National Bureau of Standards, 71B:233–240.
- J. Eisner. 1996. Three new probabilistic models for dependency parsing: An exploration. In Proc. COLING.
- J. Hajic, E. Hajicova, P. Pajas, J. Panevova, P. Sgall, and B. Vidova Hladka. 2001. The Prague Dependency Treebank 1.0 CDROM. Linguistics Data Consortium Cat. No. LDC2001T10.
- J. Hajic. 1998. Building a syntactically annotated corpus: The Prague dependency treebank. Issues of Valency and Meaning, pages 106–132.
- H. Hirakawa. 2001. Semantic dependency analysis method for Japanese based on optimum tree search algorithm. In Proc. of PACLING.
- K.-U. Hoffgen. 1993. Learning and robust learning of product distributions. In Proc. COLT, pages 77–83.
- W. Hou. 1996. Algorithm for finding the first k shortest arborescences of a digraph. Mathematica Applicata, 9(1):1–4.
- R. Hudson. 1984. Word Grammar. Blackwell.
- G. Leonidas. 2003. Arborescence optimization problems solvable by Edmonds' algorithm. Theoretical Computer Science, 301:427–437.
- M. Marcus, B. Santorini, and M. Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330.
- R. McDonald, K. Crammer, and F. Pereira. 2005. Online large-margin training of dependency parsers. In Proc. ACL.
- J. Nivre and J. Nilsson. 2005. Pseudo-projective dependency parsing. In Proc. ACL.
- J. Nivre and M. Scholz. 2004. Deterministic dependency parsing of English text. In Proc. COLING.
- Y. Shinyama, S. Sekine, K. Sudo, and R. Grishman. 2002. Automatic paraphrase acquisition from news articles. In Proc. HLT.
- R. Snow, D. Jurafsky, and A. Y. Ng. 2004. Learning syntactic patterns for automatic hypernym discovery. In NIPS 2004.
- R.E. Tarjan. 1977. Finding optimum branchings. Networks, 7:25–35.
- B. Taskar, C. Guestrin, and D. Koller. 2003. Max-margin Markov networks. In Proc. NIPS.
- B. Taskar, D. Klein, M. Collins, D. Koller, and C. Manning. 2004. Max-margin parsing. In Proc. EMNLP.
- W. Wang and M. P. Harper. 2004. A statistical constraint dependency grammar (CDG) parser. In Workshop on Incremental Parsing: Bringing Engineering and Cognition Together (ACL).
- H. Yamada and Y. Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proc. IWPT.
- D. Zeman. 2004. Parsing with a Statistical Dependency Model. Ph.D. thesis, Univerzita Karlova, Praha.
