Bilingual lexicon extraction using locally weighted linear regression from comparable corpora

2015 International Conference on Asian Language Processing (IALP), 2015

Abstract
Recently, a simple linear transformation of word embeddings has been found to be highly effective for extracting a bilingual lexicon from comparable corpora. However, transforming all words with a single transformation matrix tends to underfit. This paper proposes a simple non-parametric solution using locally weighted linear regression (LWR), in which words in the training lexicon that are closer to the target word contribute more to the regression objective. The experimental results confirm that the proposed solution achieves a 36.7% relative improvement at Top-1 over the baseline approach on the English-to-Chinese bilingual lexicon extraction task.
Keywords
bilingual lexicon extraction,word embedding,transformation matrix,locally weighted linear regression
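
The idea of locally weighted linear regression summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the Gaussian kernel, the bandwidth `tau`, the small `ridge` term, and the function names `lwr_map` and `top_k` are all assumptions made for illustration. For each query word, a separate transformation matrix is fitted, with training pairs weighted by their closeness to the query in the source embedding space.

```python
import numpy as np


def lwr_map(x, X_src, Y_tgt, tau=1.0, ridge=1e-6):
    """Map one source embedding x into the target space.

    X_src: (n, d) source embeddings of the training lexicon.
    Y_tgt: (n, k) target embeddings of their translations.
    tau and ridge are illustrative hyperparameters (assumptions).
    """
    # Gaussian kernel weights (assumed): closer training words weigh more.
    d2 = np.sum((X_src - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * tau ** 2))

    # Weighted least squares: minimize sum_i w_i * ||x_i W - y_i||^2.
    Xw = X_src * w[:, None]
    A = X_src.T @ Xw + ridge * np.eye(X_src.shape[1])
    B = Xw.T @ Y_tgt
    W = np.linalg.solve(A, B)  # local (d, k) transformation matrix
    return x @ W


def top_k(y_hat, Y_vocab, k=1):
    """Indices of the k target embeddings most cosine-similar to y_hat."""
    norms = np.linalg.norm(Y_vocab, axis=1) * np.linalg.norm(y_hat) + 1e-12
    sims = (Y_vocab @ y_hat) / norms
    return np.argsort(-sims)[:k]
```

Because the weights depend on the query word, the regression is solved once per query rather than once for the whole lexicon. With a very large `tau` the weights become nearly uniform and the sketch approaches the single-matrix baseline the abstract describes.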