Pronominal anaphora resolution in Chinese (2006)

Abstract
Resolving pronominal anaphors in English has been a focus of research in natural language processing for decades. Methods ranging from linguistics-oriented, rule-based approaches to data-oriented, machine-learning approaches have been applied to the problem of finding the antecedents of pronouns. In contrast to the abundance of research on English, there is almost no work on the problem in Chinese. This thesis addresses that gap by presenting both a rule-based and a machine-learning approach to anaphora resolution. An important difference between the two languages is that Chinese, unlike English, is a pro-drop language and has null (zero) pronouns. The rule-based approach is applied to resolving these null pronouns as well as overt third-person pronouns. The rule-based method uses the Hobbs algorithm, of which three versions are presented. The first uses only syntactic structure to select an antecedent. The second uses limited number and gender agreement, while the third incorporates semantic constraints on the proposed antecedents. For the machine-learning method, supervised maximum entropy models are used. Different models were trained using feature sets that parallel the information sources used by the different versions of the Hobbs algorithm. Two sets of data were used. The Penn Chinese Treebank (CTB) provided the test data for resolution of both overt third-person pronouns and zero pronouns; the CTB parses were annotated for coreference using guidelines drawn up for this work. Data annotated for the 2004 Chinese ACE program were used for training and testing the maximum entropy models to find the antecedents of overt third-person pronouns. The results of experiments with the two basic methods, using the different levels of linguistic information, are presented and discussed.
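To make the three levels of linguistic information concrete, the following is a minimal Python sketch, not the thesis's implementation. It does not perform the Hobbs tree search; it assumes the candidate antecedents have already been listed in the order a syntactic search would propose them, and only shows how agreement and semantic filters change the choice. The `Mention` class, the `resolve` function, the example names, the use of a human/non-human feature as the semantic constraint, and the choice to make level 3 include the level 2 checks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A pronoun or candidate antecedent; None means the feature is unknown."""
    text: str
    number: Optional[str] = None   # "sg" or "pl"
    gender: Optional[str] = None   # "m" or "f"
    human: Optional[bool] = None   # assumed semantic feature (illustrative)


# Overt third-person pronouns of written Chinese and their features.
PRONOUNS = {
    "他": Mention("他", "sg", "m", True),
    "她": Mention("她", "sg", "f", True),
    "它": Mention("它", "sg", None, False),
    "他们": Mention("他们", "pl", None, True),   # also used for mixed groups
    "她们": Mention("她们", "pl", "f", True),
    "它们": Mention("它们", "pl", None, False),
}


def compatible(a, b) -> bool:
    """Two feature values agree if either is unknown or both are equal."""
    return a is None or b is None or a == b


def resolve(pronoun: str, candidates: List[Mention], level: int) -> Optional[Mention]:
    """Return the first candidate that passes the filters for this level.

    level 1: proposal order only (stand-in for syntax-only selection)
    level 2: also require number and gender agreement
    level 3: also require the assumed human/non-human constraint
    """
    p = PRONOUNS[pronoun]
    for c in candidates:
        if level >= 2 and not (compatible(p.number, c.number)
                               and compatible(p.gender, c.gender)):
            continue
        if level >= 3 and not compatible(p.human, c.human):
            continue
        return c
    return None


# Hypothetical candidates, already in proposal order.
candidates = [
    Mention("李红", "sg", "f", True),     # a woman's name
    Mention("报告", "sg", None, False),   # "report"
    Mention("王明", "sg", "m", True),     # a man's name
]

for level in (1, 2, 3):
    print(level, resolve("他", candidates, level).text)
# prints:
# 1 李红
# 2 报告
# 3 王明
```

In this toy example each added information source rejects the candidate chosen at the previous level, which is the kind of contrast the three versions are designed to measure.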
Keywords
rule-based approach, pronominal anaphora resolution, Chinese ACE program, machine-learning anaphora resolution approach, Penn Chinese Treebank, machine-learning method, machine-learning approach, Hobbs algorithm, supervised machine-learning model, third-person pronoun, rule-based method