Resolving surface forms to Wikipedia topics

COLING (2010)

Abstract
Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method of disambiguating surface forms and resolving them to Wikipedia entities and concepts. Our method employs an extensive set of features mined from Wikipedia and other large data sources, and combines the features using a machine learning approach with automatically generated training data. Based on a manually labeled evaluation set containing over 1000 news articles, our resolution model has 85% precision and 87.8% recall. The performance is significantly better than three baselines based on traditional context similarities or sense commonness measurements. Our method can be applied to other languages and scales well to new entities and concepts.
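The abstract names sense commonness as one of the baselines but does not spell out how it is computed. As a rough illustration only, the sketch below assumes commonness means the fraction of times a surface form links to a given topic, and resolves each surface form to its most frequent topic. The counts, topic names, and function names are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy data: how often each surface form links to each topic.
# Real counts would come from a link-rich corpus; these numbers are made up.
ANCHOR_COUNTS = {
    "jaguar": Counter({"Jaguar_Cars": 60, "Jaguar": 35, "Jacksonville_Jaguars": 5}),
    "apple": Counter({"Apple_Inc.": 70, "Apple": 30}),
}


def commonness(surface_form, counts=ANCHOR_COUNTS):
    """Return {topic: estimated P(topic | surface form)} from link counts."""
    topic_counts = counts.get(surface_form.lower())
    if not topic_counts:
        return {}
    total = sum(topic_counts.values())
    return {topic: n / total for topic, n in topic_counts.items()}


def resolve_by_commonness(surface_form, counts=ANCHOR_COUNTS):
    """Pick the most common topic for a surface form, ignoring context."""
    scores = commonness(surface_form, counts)
    if not scores:
        return None  # unseen surface form: nothing to resolve to
    return max(scores, key=scores.get)
```

A baseline of this kind ignores the surrounding text entirely, which is why a model that also uses context features can be expected to improve on it for ambiguous mentions.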
Keywords
concept reference, training data, disambiguating surface form, extensive set, effective method, mining text, new entity, wikipedia entity, news article, large data source, wikipedia topic, resolving surface form