Automatic acquisition of bilingual rules for extraction of bilingual word pairs from parallel corpora

DeepLA '05: Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition (2005)

Abstract
In this paper, we propose a new learning method that addresses the sparse data problem in the automatic extraction of bilingual word pairs from parallel corpora in various languages. The method automatically acquires rules that are effective in solving the sparse data problem, using only the parallel corpora and no prior bilingual resources (e.g., a bilingual dictionary or a machine translation system). We call this method Inductive Chain Learning (ICL). ICL can limit the search scope when deciding equivalents. Using ICL, the recall of three systems based on similarity measures improved by 8.0, 6.1, and 6.0 percentage points, respectively. In addition, the recall of GIZA++ improved by 6.6 percentage points using ICL.
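To make the baseline setting concrete, the following is a minimal sketch of a similarity-based extractor of the kind the abstract refers to, together with a recall computation. It is not ICL, and it is not taken from the paper: the abstract does not name the similarity measures used, so the Dice coefficient, the toy corpus, and all function names (`dice_scores`, `best_equivalents`, `recall`) are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Score every co-occurring (source, target) word pair with the Dice coefficient.

    Assumption: Dice is used here only as an example similarity measure.
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_freq.update(src_words)   # sentence pairs containing each source word
        tgt_freq.update(tgt_words)   # sentence pairs containing each target word
        pair_freq.update(product(src_words, tgt_words))  # co-occurrence counts
    return {
        (s, t): 2 * c / (src_freq[s] + tgt_freq[t])
        for (s, t), c in pair_freq.items()
    }


def best_equivalents(scores):
    """Pick, for each source word, the target word with the highest score."""
    best = {}
    for (s, t), score in scores.items():
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best


def recall(extracted, gold):
    """Fraction of gold word pairs that the extractor recovered."""
    correct = sum(1 for s, t in gold.items() if s in extracted and extracted[s][0] == t)
    return correct / len(gold)


# Hypothetical toy parallel corpus (English-French), for illustration only.
corpus = [("the cat", "le chat"), ("the dog", "le chien"), ("a cat", "un chat")]
scores = dice_scores(corpus)
best = best_equivalents(scores)
gold = {"cat": "chat", "dog": "chien"}
print(recall(best, gold))
```

In a real corpus, a word that occurs only once or twice co-occurs with many candidates at similar scores, which is the sparse data problem the abstract describes; a method that narrows the candidate set before scoring would aim to raise recall on exactly those words.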
Keywords
parallel corpus, percentage point, sparse data problem, bilingual dictionary, bilingual resource, bilingual word pair, Inductive Chain Learning, new learning method, recall value, automatic extraction, bilingual rule, automatic acquisition