Building a Cross-Language Entity Linking Collection in Twenty-One Languages
CLEF'11: Proceedings of the Second International Conference on Multilingual and Multimodal Information Access Evaluation (2011)
Abstract
We describe an efficient way to create a test collection for evaluating the accuracy of cross-language entity linking. Queries are created by semiautomatically identifying person names on the English side of a parallel corpus, using judgments obtained through crowdsourcing to identify the entity corresponding to the name, and projecting the English name onto the non-English document using word alignments. We applied the technique to produce the first publicly available multilingual cross-language entity linking collection. The collection includes approximately 55,000 queries, comprising between 875 and 4,329 queries for each of twenty-one non-English languages.
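The projection step can be illustrated with a minimal sketch. It assumes the word alignments are available as (English index, non-English index) token pairs; the function name `project_span`, the data layout, and the example sentence are hypothetical and are not taken from the paper.

```python
def project_span(src_span, alignments, tgt_tokens):
    """Project an English token span onto the non-English sentence.

    src_span:   (start, end) token indices on the English side, end exclusive.
    alignments: iterable of (src_index, tgt_index) word-alignment pairs
                (assumed format, for illustration only).
    tgt_tokens: tokens of the non-English sentence.

    Returns the target-side text from the first to the last aligned token,
    or None when no token of the name is aligned.
    """
    start, end = src_span
    # Collect every target position aligned to a token inside the name.
    tgt_indices = sorted({t for s, t in alignments if start <= s < end})
    if not tgt_indices:
        return None
    first, last = tgt_indices[0], tgt_indices[-1]
    # Take the contiguous target span covering all aligned positions.
    return " ".join(tgt_tokens[first:last + 1])


# Hypothetical English/Spanish sentence pair with a reordered alignment.
en_tokens = ["Javier", "Solana", "arrived"]
es_tokens = ["Llegó", "Javier", "Solana"]
alignments = [(0, 1), (1, 2), (2, 0)]

projected = project_span((0, 2), alignments, es_tokens)
print(projected)
```

Taking the full span between the first and last aligned token is one simple design choice; it tolerates an unaligned token inside the name but can over-extend when the alignments are noisy.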
Keywords
Entity Linking, Cross-Language Entity Linking, Multilingual Corpora, Crowdsourcing