Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia.

ACL '12: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1 (2012)

Abstract
In this paper we propose a method to automatically label multi-lingual data with named entity tags. We build on prior work utilizing Wikipedia metadata and show how to effectively combine the weak annotations stemming from Wikipedia metadata with information obtained through English-foreign language parallel Wikipedia sentences. The combination is achieved using a novel semi-CRF model for foreign sentence tagging in the context of a parallel English sentence. The model outperforms both standard annotation projection methods and methods based solely on Wikipedia metadata.
Keywords
Wikipedia metadata, Wikipedia sentence, foreign sentence, novel semi-CRF model, parallel English sentence, English-foreign language, entity tag, multi-lingual data, prior work, standard annotation projection method, entity recognition, parallel data