Making caches work for graph analytics

2017 IEEE International Conference on Big Data (Big Data) (2017)

Abstract
Large-scale applications implemented in today's high-performance graph frameworks heavily underutilize modern hardware systems. While many graph frameworks have made substantial progress in optimizing these applications, we show that it is still possible to achieve up to 5× speedups over the fastest frameworks by greatly improving cache utilization. Previous systems have applied out-of-core processing techniques from the memory/disk boundary to the cache/DRAM boundary. However, we find that blindly applying such techniques is ineffective because the much smaller performance gap between cache and DRAM requires new designs for achieving scalable performance and low overhead. We present Cagra, a cache-optimized in-memory graph framework. Cagra uses a novel technique, CSR Segmenting, to break the vertices into segments that fit in the last-level cache, and partitions the graph into subgraphs based on the segments. Random accesses in each subgraph are limited to one segment at a time, eliminating the much slower random accesses to DRAM. The intermediate updates from each subgraph are written into buffers sequentially and later merged using a low-overhead parallel cache-aware merge. Cagra achieves speedups of up to 5× for PageRank, Collaborative Filtering, Label Propagation, and Betweenness Centrality over the best published results from state-of-the-art graph frameworks, including GraphMat, Ligra, and GridGraph.
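The segmenting idea described in the abstract can be illustrated with a minimal Python sketch of one PageRank iteration. This is not Cagra's implementation: the function names `build_segments` and `pagerank_step`, the dictionary-based subgraph layout, and the `segment_size` parameter are assumptions made for illustration, and the merge shown here is a plain sequential loop rather than the parallel cache-aware merge the paper describes. The sketch only shows the access pattern: each subgraph reads source-vertex data from a single segment, writes its partial sums to its own buffer in destination order, and the buffers are combined afterwards.

```python
from collections import defaultdict


def build_segments(edges, num_vertices, segment_size):
    """Partition edges into subgraphs by the segment of their source vertex.

    Hypothetical layout: segments[i] maps each destination vertex to the
    list of its source vertices that fall inside segment i.
    """
    num_segments = (num_vertices + segment_size - 1) // segment_size
    segments = [defaultdict(list) for _ in range(num_segments)]
    for src, dst in edges:
        segments[src // segment_size][dst].append(src)
    return segments


def pagerank_step(segments, rank, out_degree, damping=0.85):
    """Run one PageRank iteration over the segmented graph."""
    n = len(rank)
    # Per-vertex contribution; vertices without out-edges contribute nothing.
    contrib = [rank[v] / out_degree[v] if out_degree[v] else 0.0
               for v in range(n)]

    # Process one subgraph at a time. Reads of contrib[] stay inside one
    # segment of source vertices, and each partial sum is appended to that
    # subgraph's buffer in increasing destination order.
    buffers = []
    for subgraph in segments:
        buffer = [(dst, sum(contrib[src] for src in srcs))
                  for dst, srcs in sorted(subgraph.items())]
        buffers.append(buffer)

    # Merge the per-subgraph buffers into the new rank vector.
    # (Sequential stand-in for the paper's parallel cache-aware merge.)
    new_rank = [(1.0 - damping) / n] * n
    for buffer in buffers:
        for dst, partial in buffer:
            new_rank[dst] += damping * partial
    return new_rank


# Small example graph: 4 vertices, 5 directed edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
out_degree = [2, 1, 1, 1]
rank = [0.25] * 4
new_rank = pagerank_step(build_segments(edges, 4, 2), rank, out_degree)
```

In a real system the segment size would be chosen so that one segment of vertex data fits in the last-level cache; here `segment_size=2` is picked only to force the toy graph into two subgraphs. The result does not depend on the segment size, since segmenting only changes the order in which contributions are accumulated.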
Keywords
graph analytics, large-scale applications, modern hardware systems, cache utilization, out-of-core processing techniques, memory/disk boundary, cache/DRAM boundary, Cagra, cache-optimized in-memory graph framework, CSR Segmenting, subgraph, last-level cache, graph partitioning, high-performance graph framework, random accesses, buffers, low-overhead parallel cache-aware merge, PageRank, collaborative filtering, label propagation