ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation.

HLT '11: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2 (2011)

Abstract
This paper describes a set of exploratory experiments for a multilingual classification-based approach to Word Sense Disambiguation. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework in which the word senses are derived automatically from word alignments on a parallel corpus. We built five classifiers with English as the input language and translations in the five supported languages (viz. French, Dutch, Italian, Spanish and German) as classification output. The feature vectors incorporate both the more traditional local context features and binary bag-of-words features extracted from the aligned translations. Our results show that the ParaSense multilingual WSD system is very competitive with the best systems evaluated on the SemEval-2010 Cross-Lingual Word Sense Disambiguation task for all five target languages.
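The feature design mentioned in the abstract (local context features combined with binary bag-of-words features from aligned translations) can be illustrated with a minimal Python sketch. This is not the ParaSense implementation: the function names, the window size, the feature-key format, the example sentence, its translations, and the tiny vocabularies are all hypothetical, chosen only to show how the two feature groups might be merged into one vector.

```python
from typing import Dict, List


def local_context_features(tokens: List[str], focus: int, window: int = 2) -> Dict[str, int]:
    """Word-form features for the tokens around the focus word (assumed window of 2)."""
    feats: Dict[str, int] = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the focus word itself
        i = focus + offset
        word = tokens[i].lower() if 0 <= i < len(tokens) else "<pad>"
        feats[f"ctx[{offset:+d}]={word}"] = 1
    return feats


def translation_bow_features(
    translations: Dict[str, List[str]], vocab: Dict[str, List[str]]
) -> Dict[str, int]:
    """Binary bag-of-words: 1 if a vocabulary word occurs in the aligned translation, else 0."""
    feats: Dict[str, int] = {}
    for lang, words in vocab.items():
        present = {w.lower() for w in translations.get(lang, [])}
        for w in words:
            feats[f"bow[{lang}]={w}"] = 1 if w in present else 0
    return feats


def build_feature_vector(
    tokens: List[str],
    focus: int,
    translations: Dict[str, List[str]],
    vocab: Dict[str, List[str]],
    window: int = 2,
) -> Dict[str, int]:
    """Merge both feature groups into a single sparse feature dictionary."""
    feats = local_context_features(tokens, focus, window)
    feats.update(translation_bow_features(translations, vocab))
    return feats


# Illustrative input only: an English sentence with the focus word "bank"
# and made-up aligned translations in Dutch and Spanish.
example = build_feature_vector(
    tokens=["He", "sat", "on", "the", "bank", "of", "the", "river"],
    focus=4,
    translations={
        "nl": ["hij", "zat", "op", "de", "oever", "van", "de", "rivier"],
        "es": ["se", "sentó", "a", "la", "orilla", "del", "río"],
    },
    vocab={"nl": ["oever", "geld"], "es": ["orilla", "dinero"]},
)
```

A vector like `example` would then be passed, together with the target-language translation of the focus word as the class label, to whatever classifier is used. The abstract does not name the learner, so none is assumed here.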
Keywords
Word Sense, Disambiguation task, SemEval-2010 Cross-Lingual Word Sense, multilingual classification-based approach, word alignment, best system, binary bag-of-words feature, classification output, competitive result, exploratory experiment, parallel corpus, word sense disambiguation