Towards a fair comparison between name disambiguation approaches
OAIR(2013)
Abstract
Searching for information about people in search engines is a common and straightforward task that is often hampered by name ambiguity. While users are interested in information about a single person, results pages usually include many people who share the same name. There are several approaches to personal name disambiguation; however, it is still a challenge to understand the impact of each approach in isolation. In this paper, we present a plugin-based framework that aims to compare name disambiguation approaches and identify the most promising ones. This framework enabled us to merge different approaches to find good combinations for this task and to compare state-of-the-art solutions using a common dataset. Preliminary results suggest that biographical information has a greater impact on clustering, that comprehensive texts work better than metadata alone, and that TF-IDF works better than more complex approaches.
Keywords
plugin-based framework,common dataset,results page,name disambiguation approach,fair comparison,straightforward task,name disambiguation,complex approach,biographical information,name ambiguity,greater impact,personal name disambiguation,clustering,feature selection,vector space model
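
As a rough illustration of the TF-IDF and clustering combination the abstract refers to, the following is a minimal sketch and not the framework described in the paper. The function names (`tfidf_vectors`, `cosine`, `cluster_pages`), the whitespace tokenization, the single-link merging rule, and the 0.05 similarity threshold are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: naive lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_pages(docs, threshold=0.05):
    """Group pages by single-link clustering over TF-IDF cosine similarity.

    The threshold is a hypothetical value chosen for this toy example.
    Returns one integer cluster label per input document.
    """
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(docs))]


# Toy result pages for one ambiguous name (invented for illustration).
pages = [
    "john smith professor of physics at the university",
    "john smith physics lecture notes university course",
    "john smith guitarist new album tour dates",
    "john smith album review guitarist band",
]
print(cluster_pages(pages))
```

Because the name itself occurs on every page, its inverse document frequency is zero and it does not contribute to the similarity; only the remaining terms separate the two people in this toy input.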