A Survey of Person Name Disambiguation on the Web

IEEE Access (2018)

Abstract
Person name disambiguation on the Web (PNDW) consists of grouping the Web pages that a search engine retrieves for a person-name query according to the individuals they refer to. This problem is of interest to the research community because Internet users often search for information about people on search engines, and because people's names are a highly ambiguous type of named entity. In addition, the Web domain presents several challenges for natural language processing and information retrieval methods. In this paper, we classify PNDW systems according to their main characteristics: 1) features used to identify different individuals with the same name; 2) mathematical models used to represent the search results; 3) clustering algorithms used to group the Web pages; 4) methods used to address the impact of Web pages from social networking sites; and 5) methods used to deal with the multilingual nature of the Web. We also present the data sets most widely used to evaluate PNDW systems. Finally, we analyze the results obtained by the best PNDW systems in the literature.
Keywords
Document clustering, person name disambiguation, search engines, Web people search
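
As a rough illustration of the task the abstract defines (features, a vector representation, then clustering), the following is a minimal sketch and not a method taken from the survey. It assumes bag-of-words term counts as features, cosine similarity as the comparison, and single-link grouping with a fixed threshold. The function names, the tiny stop-word list, the threshold value, and the sample pages are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for this illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "at", "is"}


def tokenize(text, query):
    """Lowercase the text and drop stop words and the queried name itself."""
    ignored = STOP_WORDS | set(re.findall(r"[a-z0-9]+", query.lower()))
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in ignored]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_pages(pages, query, threshold=0.3):
    """Group page indices so that each group is assumed to be one individual.

    Single-link grouping: two pages end up in the same group whenever a chain
    of pairwise similarities >= threshold connects them (assumed threshold).
    """
    vectors = [Counter(tokenize(p, query)) for p in pages]
    parent = list(range(len(pages)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(pages)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Made-up search results for the ambiguous query "John Smith".
    sample_pages = [
        "John Smith is a professor of computer science at the university",
        "Professor John Smith teaches computer science courses",
        "John Smith plays guitar in a rock band",
        "The rock band of guitarist John Smith released an album",
    ]
    print(cluster_pages(sample_pages, "John Smith"))
```

The queried name is removed from the features because every retrieved page contains it, so it carries no information about which individual a page refers to. A fixed threshold is the simplest stopping rule; the number of individuals is not known in advance, so a practical system would need some way to choose it.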