Parallel edge-based sampling for static and dynamic graphs

Proceedings of the 16th ACM International Conference on Computing Frontiers (2019)

Abstract
Graph sampling is an important tool to obtain small and manageable subgraphs from large real-world graphs. Prior research has shown that Induced Edge Sampling (IES) outperforms other sampling methods in terms of the quality of the subgraph obtained. Even though fast sampling is crucial for several workflows, there has been little work on parallel sampling algorithms in the past. In this paper, we present parIES, a framework for parallel Induced Edge Sampling on shared-memory parallel machines. parIES, equipped with optimized load-balancing and synchronization-avoiding strategies, can sample both static and streaming dynamic graphs while achieving high scalability and parallel efficiency. We develop a lightweight concurrent hash table coupled with a space-efficient dynamic graph data structure to overcome the challenges and memory constraints of sampling streaming dynamic graphs. We evaluate parIES on a 16-core (32-thread) Intel server using 7 large synthetic and real-world networks. From a static graph, parIES can sample a subgraph with more than 1.4B edges in under 2.5 s and achieve up to 15.5X parallel speedup. For dynamic streaming graphs, parIES can process up to 86.7M edges per second, achieving 15X parallel speedup.
Keywords
big data, dynamic graph data structure, induced edge sampling, parallel graph sampling, streaming graphs
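
For readers unfamiliar with the Induced Edge Sampling named in the abstract, the following is a minimal sequential sketch of the commonly described two-phase scheme: sample edges uniformly at random until enough vertices are collected, then keep every edge of the original graph whose endpoints were both sampled. It is not the parIES implementation; the function name `induced_edge_sample`, its parameters, and the edge-list input format are assumptions made for illustration, and none of the parallel load balancing, concurrent hashing, or streaming support described in the abstract is modeled.

```python
import random


def induced_edge_sample(edges, target_vertices, seed=0):
    """Sequential sketch of induced edge sampling (hypothetical helper).

    edges: iterable of (u, v) pairs describing a static graph.
    target_vertices: desired minimum number of sampled vertices.
    Returns (sampled_vertices, induced_edges).
    """
    rng = random.Random(seed)
    edges = list(edges)

    # Never ask for more vertices than the graph actually contains,
    # otherwise the sampling loop below could not terminate.
    all_vertices = {v for e in edges for v in e}
    target = min(target_vertices, len(all_vertices))

    # Phase 1: pick edges uniformly at random (with replacement) and
    # collect their endpoints. Both endpoints are added together, so the
    # final set may exceed the target by one vertex.
    sampled = set()
    while len(sampled) < target:
        u, v = rng.choice(edges)
        sampled.add(u)
        sampled.add(v)

    # Phase 2: graph induction. Keep every original edge whose two
    # endpoints are both in the sampled vertex set.
    induced = [(u, v) for (u, v) in edges if u in sampled and v in sampled]
    return sampled, induced


if __name__ == "__main__":
    toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    vertices, subgraph = induced_edge_sample(toy_edges, target_vertices=3)
    print(sorted(vertices), subgraph)
```

The induction phase is a single pass over the edge list with set-membership checks. In a shared-memory setting, that membership check is the kind of lookup a concurrent hash table would serve, though how parIES organizes this is not stated in the abstract.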