Person Name Disambiguation In Web Pages Using Social Network, Compound Words And Latent Topics

PAKDD'08: Proceedings of the 12th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (2008)

Abstract
The World Wide Web (WWW) provides much information about persons, and in recent years Web search engines have been commonly used to learn about them. However, many persons share the same name, and this ambiguity typically causes the search results for one person name to include Web pages about several different persons. We propose a novel framework for person name disambiguation that has the following three component processes: (1) extraction of social network information by finding co-occurrences of named entities, (2) measurement of document similarities based on occurrences of key compound words, and (3) inference of topic information from documents based on the Dirichlet process unigram mixture model. Experiments using an actual Web document dataset show that the results of our framework are promising.
Keywords
person name disambiguation,web people search,clustering,social network
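
As a rough illustration of the first component named in the abstract (using co-occurrences of named entities as social network evidence), the sketch below groups Web pages that mention at least one common named entity. It is not the authors' method: the function name `cluster_by_shared_entities`, the `min_shared` threshold, the union-find grouping, and the toy page data are all assumptions made for this example, and named-entity extraction is assumed to have been done beforehand.

```python
from itertools import combinations


def cluster_by_shared_entities(page_entities, min_shared=1):
    """Group pages that share at least `min_shared` named entities.

    `page_entities` maps a page id to the set of named entities found on it.
    This is a hypothetical simplification, not the paper's algorithm.
    """
    # Each page starts as its own cluster representative.
    parent = {page: page for page in page_entities}

    def find(page):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[page] != page:
            parent[page] = parent[parent[page]]
            page = parent[page]
        return page

    # Merge the clusters of any two pages with enough entities in common.
    for a, b in combinations(page_entities, 2):
        if len(page_entities[a] & page_entities[b]) >= min_shared:
            parent[find(a)] = find(b)

    # Collect pages by representative and return them in a stable order.
    clusters = {}
    for page in page_entities:
        clusters.setdefault(find(page), set()).add(page)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: search results for one ambiguous person name.
pages = {
    "page1": {"Alice Brown", "Acme University"},
    "page2": {"Acme University", "Bob Green"},
    "page3": {"Riverside Jazz Club"},
}
print(cluster_by_shared_entities(pages))
# → [['page1', 'page2'], ['page3']]
```

Here `page1` and `page2` are grouped because both mention the same organization, while `page3` shares no entity with either and stays alone. A fuller system along the lines of the abstract would combine this signal with compound-word similarity and topic information rather than rely on entity overlap alone.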