Scaling Graph 500 SSSP to 140 Trillion Edges with over 40 Million Cores
SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (2022)
Abstract
The single-source shortest path (SSSP) kernel was introduced into the Graph 500 benchmark in 2017, yet no result has been reported from a full-scale, world-leading supercomputer. The primary reason is the poor work efficiency of existing algorithms at large scale. In this paper, we propose an SSSP implementation for the newest-generation Sunway supercomputer, comprising an SSSP algorithm that achieves work efficiency and an adaptive dense/sparse-mode selection approach that achieves communication efficiency. Our implementation reaches 7638 GTEPS on 103158 processors (over 40 million cores), achieving 3.7× the performance and 512× the graph size of the current top entry on the Graph 500 SSSP list. Based on our experience of running extreme-scale SSSP, we uncover the root cause of its poor scalability: the weight distribution allows edges with weights close to zero, which makes the SSSP tree deeper on larger graphs. We further explore a scalability-friendly weight distribution that sets a non-zero lower bound on the edge weights.
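The effect of near-zero weights on tree depth can be illustrated with a small sequential sketch. This is not the paper's algorithm or the Graph 500 generator: it uses plain Dijkstra, and a uniformly random multigraph as a stand-in for the benchmark graph. The names `dijkstra_with_depth`, `random_graph`, `max_tree_depth`, and the `lower_bound` parameter are hypothetical and introduced only for illustration.

```python
import heapq
import random


def dijkstra_with_depth(adj, source):
    """Return (dist, depth): shortest distances and the hop count of each
    vertex along the shortest-path tree rooted at `source`."""
    dist = {source: 0.0}
    depth = {source: 0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                depth[v] = depth[u] + 1
                heapq.heappush(heap, (nd, v))
    return dist, depth


def random_graph(n, m, lower_bound, seed):
    """Assumed stand-in graph: m undirected edges between uniformly random
    endpoints, with weights drawn uniformly between lower_bound and 1."""
    rng = random.Random(seed)
    adj = {u: [] for u in range(n)}
    for _ in range(m):
        u, v = rng.randrange(n), rng.randrange(n)
        w = rng.uniform(lower_bound, 1.0)
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def max_tree_depth(n, m, lower_bound, seed=0):
    """Depth of the deepest vertex in the shortest-path tree from vertex 0."""
    adj = random_graph(n, m, lower_bound, seed)
    _, depth = dijkstra_with_depth(adj, 0)
    return max(depth.values())


# Toy case: a two-hop path of tiny weights beats the direct edge,
# so vertex 2 sits at depth 2 in the shortest-path tree.
tiny = {0: [(1, 0.1), (2, 1.0)], 1: [(2, 0.1)], 2: []}
# With every weight at least 0.6, the direct edge wins and the depth is 1.
bounded = {0: [(1, 0.6), (2, 1.0)], 1: [(2, 0.6)], 2: []}
```

In the toy case, `dijkstra_with_depth(tiny, 0)` places vertex 2 at depth 2, while `dijkstra_with_depth(bounded, 0)` places it at depth 1. Calling `max_tree_depth` with `lower_bound=0.0` and then with a positive bound on the same `n`, `m`, and `seed` gives a rough way to compare tree depths under the two weight distributions.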
Keywords
Graphs, Benchmark testing, Scalability, Supercomputers, Shortest path problem