Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs.
CoRR (2023)
Abstract
Finding dense subgraphs is a fundamental algorithmic tool in data mining,
community detection, and clustering. In this problem, one aims to find an
induced subgraph whose edge-to-vertex ratio is maximized.
We study the directed case of this question in the context of semi-streaming
and massively parallel algorithms. In particular, we show that it is possible
to find a $(2+\epsilon)$-approximation on randomized streams even in a single
pass by using $O(n \cdot {\rm poly} \log n)$ memory on $n$-vertex graphs. Our
result improves over prior works, which were designed for arbitrary-ordered
streams: the algorithm by Bahmani et al. (VLDB 2012) which uses $O(\log n)$
passes, and the work by Esfandiari et al. (2015) which makes one pass but uses
$O(n^{3/2})$ memory. Moreover, our techniques extend to the Massively Parallel
Computation model yielding $O(1)$ rounds in the super-linear and $O(\sqrt{\log
n})$ rounds in the nearly-linear memory regime. This constitutes a quadratic
improvement over state-of-the-art bounds by Bahmani et al. (VLDB 2012 and WAW
2014), which require $O(\log n)$ rounds even in the super-linear memory regime.
Finally, we empirically evaluate our single-pass semi-streaming algorithm on
$6$ benchmarks and show that, even on non-randomly ordered streams, the quality
of its output is essentially the same as that of Bahmani et al. (VLDB 2012)
while it is $2$ times faster on large graphs.
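
To make the objective concrete, the following is a minimal sketch of the two density measures involved. The undirected measure is the edge-to-vertex ratio described above. The directed measure, $|E(S,T)| / \sqrt{|S| \cdot |T|}$ for vertex sets $S$ and $T$, is the definition commonly used in the directed densest subgraph literature; it is not spelled out in the abstract and is an assumption here. The function names are hypothetical, and this only evaluates a candidate solution; it is not the streaming or parallel algorithm of the paper.

```python
from math import sqrt


def undirected_density(edges, vertices):
    """Edge-to-vertex ratio of the subgraph induced by `vertices`."""
    vs = set(vertices)
    if not vs:
        return 0.0
    # Count edges with both endpoints inside the chosen vertex set.
    induced = sum(1 for u, v in edges if u in vs and v in vs)
    return induced / len(vs)


def directed_density(edges, sources, targets):
    """|E(S, T)| / sqrt(|S| * |T|) over directed edges (u, v) with u in S, v in T.

    Assumed definition (standard in the literature, not stated in the abstract).
    """
    s, t = set(sources), set(targets)
    if not s or not t:
        return 0.0
    # Count directed edges leaving S and entering T; S and T may overlap.
    crossing = sum(1 for u, v in edges if u in s and v in t)
    return crossing / sqrt(len(s) * len(t))
```

For example, with directed edges `[(1, 2), (2, 3), (1, 3), (3, 4)]`, choosing $S = \{1, 2\}$ and $T = \{3\}$ counts two edges from $S$ to $T$, giving a directed density of $2 / \sqrt{2}$.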