Ranking-based name matching for author disambiguation in bibliographic data.
KDD '13: The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, Illinois, August 2013
Abstract
Author name ambiguity is a common problem in digital publication libraries such as Microsoft Academic Search. It arises mainly because different authors may publish under the same name, while the same author may publish under several names because of abbreviations, nicknames, and similar variations. Author disambiguation is the goal of Track II of the KDD Cup 2013 Data Mining Contest. In this paper, we introduce RankMatch, our ranking-based name matching algorithm and system. An important feature of our solution is its use of heterogeneous meta-paths to evaluate the similarity between two potentially duplicate authors whose names are compatible. We participated under the team name "SmallData", and our final solution achieved a mean F1 score of 99.157%, placing second in the contest.
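The abstract does not spell out how name compatibility or meta-path similarity is computed, so the following is only an illustrative sketch, not the RankMatch implementation. It assumes a simple initial-aware compatibility rule and substitutes the well-known PathSim measure over an author-paper-venue-paper-author meta-path. The function names `names_compatible` and `pathsim`, and the sample venue counts, are hypothetical.

```python
from collections import Counter


def names_compatible(a: str, b: str) -> bool:
    """Assumed rule: last names match, and each pair of given-name
    tokens is either identical or one is an initial of the other."""
    ta = a.lower().replace(".", " ").split()
    tb = b.lower().replace(".", " ").split()
    if not ta or not tb or ta[-1] != tb[-1]:
        return False
    for x, y in zip(ta[:-1], tb[:-1]):
        if x == y:
            continue
        # A single-letter token is treated as an initial.
        if (len(x) == 1 or len(y) == 1) and x[0] == y[0]:
            continue
        return False
    return True


def pathsim(counts_x: Counter, counts_y: Counter) -> float:
    """PathSim over a symmetric meta-path (a stand-in measure, not
    necessarily the one used in the paper). Each Counter maps a venue
    to the number of papers the author published there."""
    dot = sum(c * counts_y[k] for k, c in counts_x.items())
    norm = sum(c * c for c in counts_x.values()) + sum(
        c * c for c in counts_y.values()
    )
    return 2.0 * dot / norm if norm else 0.0


# Hypothetical venue profiles for two candidate duplicate authors.
author_a = Counter({"KDD": 2, "ICDM": 1})
author_b = Counter({"KDD": 1})

if names_compatible("J. Smith", "John Smith"):
    score = pathsim(author_a, author_b)
```

In a ranking-based setting, scores like this could be computed only for name-compatible pairs and then sorted, so that the most similar candidates are considered for merging first.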