Scalable Overlapping Community Detection

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016

Abstract
Recent advancements in machine learning algorithms have transformed the data analytics domain and provided innovative solutions to inherently difficult problems. However, training models at scale over large data sets remains a daunting challenge. One such problem is the detection of overlapping communities within graphs. For example, a social network can be modeled as a graph in which vertices represent individuals and edges represent their relationships. Unlike graph partitioning or clustering, overlapping community detection allows an individual to belong to multiple communities, which significantly increases the problem complexity. In this paper, we present and evaluate an efficient parallel and distributed implementation of a Stochastic Gradient Markov Chain Monte Carlo algorithm that solves the overlapping community detection problem. We show that the algorithm can scale to process graphs consisting of billions of edges and tens of millions of vertices on a compute cluster of 65 nodes. To the best of our knowledge, this is the first time that overlapping communities have been learned for problems of such a large scale.
Keywords
Distributed computing, Parallel programming, High performance computing, Performance analysis, Machine learning