Scalable Edge Clustering of Dynamic Graphs via Weighted Line Graphs.
CoRR (2023)
Abstract
Timestamped relational datasets consisting of records between pairs of
entities are ubiquitous in data and network science. For applications like
peer-to-peer communication, email, social network interactions, and computer
network security, it makes sense to organize these records into groups based on
how and when they occur. Weighted line graphs offer a natural way to
model how records are related in such datasets, but for large real-world graph
topologies the complexity of building and using the line graph is
prohibitive. We present an algorithm to cluster the edges of a dynamic graph
via the associated line graph without forming it explicitly.
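To make the line graph idea concrete, here is a minimal sketch that forms it explicitly for a tiny timestamped edge list. It is an illustration only: the function name `weighted_line_graph`, the `(u, v, t)` record layout, and the choice of absolute time difference as the weight are assumptions, not the paper's definitions. Each record becomes a node of the line graph, and two records are linked when they share an endpoint.

```python
from collections import defaultdict
from itertools import combinations


def weighted_line_graph(records):
    """Build an explicit weighted line graph from (u, v, t) records.

    Returns {(i, j): weight} with i < j, where i and j index records that
    share at least one endpoint. The weight |t_i - t_j| is an assumed,
    illustrative choice.
    """
    # Group record indices by the vertices they touch.
    incident = defaultdict(list)
    for idx, (u, v, _t) in enumerate(records):
        incident[u].append(idx)
        if v != u:
            incident[v].append(idx)

    # Every pair of records at a shared vertex becomes a line graph edge,
    # so a vertex of degree d contributes d * (d - 1) / 2 pairs.
    weights = {}
    for idxs in incident.values():
        for i, j in combinations(idxs, 2):
            weights[(i, j)] = abs(records[i][2] - records[j][2])
    return weights


records = [("a", "b", 0.0), ("b", "c", 1.0), ("c", "d", 10.0), ("a", "b", 0.5)]
line_graph = weighted_line_graph(records)
```

The per-vertex pair count grows quadratically with degree, which is why forming the line graph explicitly becomes prohibitive on graphs with high-degree vertices.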
We outline a novel hierarchical dynamic graph edge clustering approach that
efficiently breaks massive relational datasets into small sets of edges
containing events at various timescales. This is in stark contrast to
traditional graph clustering algorithms that prioritize highly connected
community structures. Our approach relies on constructing a sufficient subgraph
of a weighted line graph and applying a hierarchical agglomerative clustering.
This work draws particular inspiration from HDBSCAN.
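As a rough illustration of agglomerative clustering over line graph edges, the sketch below merges records in order of increasing weight with a union-find structure and stops at a fixed threshold. This is plain single-linkage clustering cut at one level, a deliberate simplification: it is not the paper's algorithm and omits what HDBSCAN adds (mutual reachability distances and stability-based cluster selection). The names `single_linkage_clusters` and `max_gap` and the input pairs are hypothetical.

```python
def single_linkage_clusters(num_records, weighted_pairs, max_gap):
    """Cluster record indices by single linkage, cut at weight max_gap.

    weighted_pairs is an iterable of (i, j, weight) line graph edges.
    """
    parent = list(range(num_records))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Process line graph edges from lightest to heaviest (Kruskal order).
    for i, j, w in sorted(weighted_pairs, key=lambda p: p[2]):
        if w > max_gap:
            break  # all remaining edges are heavier than the cut
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    clusters = {}
    for r in range(num_records):
        clusters.setdefault(find(r), []).append(r)
    return sorted(clusters.values())


# Hypothetical line graph edges over four records: (record, record, weight).
pairs = [(0, 3, 0.5), (0, 1, 1.0), (1, 3, 0.5), (1, 2, 9.0)]
clusters = single_linkage_clusters(4, pairs, max_gap=2.0)
```

Varying `max_gap` yields coarser or finer groupings, which loosely mirrors the idea of capturing events at different timescales. A hierarchical method records all merge levels instead of committing to one cut.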
We present a parallel algorithm and show that it is able to break
billion-scale dynamic graphs into small sets that correlate in topology and
time. The entire clustering process for a graph with $O(10 \text{ billion})$
edges takes just a few minutes of run time on 256 nodes of a distributed
compute environment. We argue that the output of the edge clustering is useful
for a range of data visualization and machine learning tasks involving the
original massive dynamic graph data, the non-relational metadata, or both.
Finally, we demonstrate its use on a real-world large-scale directed
dynamic graph and describe how it can be extended to dynamic hypergraphs and
graphs with unstructured data living on vertices and edges.