Predicting and Identifying Missing Node Information in Social Networks

ACM Transactions on Knowledge Discovery From Data(2013)

引用 22|浏览14
暂无评分
摘要
In recent years, social networks have surged in popularity. One key aspect of social network research is identifying important missing information that is not explicitly represented in the network, or is not visible to all. To date, this line of research typically focused on finding the connections that are missing between nodes, a challenge typically termed as the link prediction problem. This article introduces the missing node identification problem, where missing members in the social network structure must be identified. In this problem, indications of missing nodes are assumed to exist. Given these indications and a partial network, we must assess which indications originate from the same missing node and determine the full network structure. Toward solving this problem, we present the missing node identification by spectral clustering algorithm (MISC), an approach based on a spectral clustering algorithm, combined with nodes’ pairwise affinity measures that were adopted from link prediction research. We evaluate the performance of our approach in different problem settings and scenarios, using real-life data from Facebook. The results show that our approach has beneficial results and can be effective in solving the missing node identification problem. In addition, this article also presents R-MISC, which uses a sparse matrix representation, efficient algorithms for calculating the nodes’ pairwise affinity, and a proprietary dimension reduction technique to enable scaling the MISC algorithm to large networks of more than 100,000 nodes. Last, we consider problem settings where some of the indications are unknown. Two algorithms are suggested for this problem: speculative MISC, based on MISC, and missing link completion, based on classical link prediction literature. We show that speculative MISC outperforms missing link completion.
更多
查看译文
关键词
algorithms,data mining,missing nodes,performance,social networks,spectral clustering,theory
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要