On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices.

IPDPS Workshops (2018)

Abstract
Researchers developing implementations of distributed graph analytic algorithms require graph generators that yield graphs sharing the challenging characteristics of real-world graphs (small-world, scale-free, heavy-tailed degree distribution) with efficiently calculable ground-truth solutions to the desired output. Reproducibility of current generators [1] used in benchmarking is somewhat lacking in this respect because of their randomness: the output of a desired graph analytic can only be compared to expected values and not to exact ground truth. Nonstochastic Kronecker product graphs [2] meet these design criteria for several graph analytics. Here we show that many flavors of triangle participation can be cheaply calculated while generating a Kronecker product graph. Given two medium-sized scale-free graphs with adjacency matrices A and B, their Kronecker product graph has adjacency matrix C = A ⊗ B. Such graphs are highly compressible: |E| edges are represented in O(|E|^{1/2}) memory and can be built in a distributed setting from small data structures, making them easy to share in compressed form. Many interesting graph calculations have worst-case complexity bounds O(|E|^p), and these are often reduced to O(|E|^{p/2}) for Kronecker product graphs when a Kronecker formula can be derived that yields the sought calculation on C in terms of related calculations on A and B. We focus on deriving formulas for triangle participation at vertices, t_C, a vector storing the number of triangles that every vertex is involved in, and triangle participation at edges, Δ_C, a sparse matrix storing the number of triangles at every edge. When factors A and B are undirected, C is also undirected. In the case when both factors have no self loops, we show t_C = 2 t_A ⊗ t_B and Δ_C = Δ_A ⊗ Δ_B. Moreover, we derive the respective formulas when A and B have self loops, which boosts the triangle counts for the associated vertices and edges in C.
We additionally demonstrate strong assumptions on B that allow the truss decomposition of C to be derived cheaply from the truss decomposition of A. We extend these results and show Kronecker formulas for triangle participation in both directed graphs and undirected, vertex-labeled graphs. In these classes of graphs, each vertex or edge can participate in many different types of triangles.
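The two loop-free formulas stated above, t_C = 2 t_A ⊗ t_B and Δ_C = Δ_A ⊗ Δ_B, can be checked numerically on small factors. The following is a minimal sketch, not the authors' implementation. It assumes NumPy and dense matrices, and it uses the tiny complete graphs K3 and K4 as stand-in factors where the paper uses medium-sized scale-free graphs. The helper names `vertex_triangles`, `edge_triangles`, and `complete_graph` are hypothetical. The sketch relies on the standard identities t = diag(A^3) / 2 and Δ = A ∘ A^2 for undirected graphs without self loops.

```python
import numpy as np

def vertex_triangles(adj):
    # Triangles at each vertex of an undirected, loop-free graph: diag(adj^3) / 2.
    return np.diag(adj @ adj @ adj) // 2

def edge_triangles(adj):
    # Triangles at each edge: elementwise product of adj with adj^2.
    return adj * (adj @ adj)

def complete_graph(n):
    # Adjacency matrix of K_n (undirected, no self loops); illustrative factor only.
    return np.ones((n, n), dtype=np.int64) - np.eye(n, dtype=np.int64)

A = complete_graph(3)
B = complete_graph(4)
C = np.kron(A, B)  # adjacency matrix of the Kronecker product graph

t_A = vertex_triangles(A)
t_B = vertex_triangles(B)
t_C = vertex_triangles(C)  # computed directly on C, for comparison only

# Compare the direct computation on C with the Kronecker formulas on the factors.
vertex_formula_holds = np.array_equal(t_C, 2 * np.kron(t_A, t_B))
edge_formula_holds = np.array_equal(
    edge_triangles(C), np.kron(edge_triangles(A), edge_triangles(B))
)

print(vertex_formula_holds, edge_formula_holds)  # prints: True True
```

In this example every vertex of K3 lies in 1 triangle and every vertex of K4 lies in 3, so each of the 12 vertices of C lies in 2 · 1 · 3 = 6 triangles. The right-hand sides only require triangle statistics on the small factors, which is where the savings described in the abstract come from.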
Keywords
graph generation,Kronecker graph,triangle counting,directed graphs,labeled graphs,truss decomposition