HADI: Mining Radii of Large Graphs
TKDD (2011)
Abstract
Given large, multimillion-node graphs (e.g., Facebook, Web crawls), how do they evolve over time? How are they connected? What are the central nodes and the outliers? In this article we define the Radius plot of a graph and show how it can answer these questions. However, computing the Radius plot is prohibitively expensive for graphs reaching the planetary scale. There are two major contributions in this article: (a) we propose HADI (HAdoop DIameter and radii estimator), a carefully designed and fine-tuned algorithm to compute the radii and the diameter of massive graphs, which runs on top of the Hadoop/MapReduce system with excellent scale-up on the number of available machines; (b) we run HADI on several real-world datasets, including YahooWeb (6B edges, 1/8 of a terabyte), one of the largest public graphs ever analyzed. Thanks to HADI, we report fascinating patterns on large networks, such as the surprisingly small effective diameter, the multimodal/bimodal shape of the Radius plot, and its palindrome motion over time.
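To make the quantities named in the abstract concrete, the following is a minimal sketch of an exact computation on a tiny graph. It is not HADI itself, which the abstract describes as an estimator running on Hadoop/MapReduce. The sketch assumes that the radius of a node is the largest shortest-path hop count from that node to any node it can reach, that the Radius plot is the count of nodes per radius value, and that the diameter is the largest radius; the abstract does not spell out these definitions. The function names and the toy graph are hypothetical.

```python
from collections import Counter, deque


def node_radius(adj, source):
    """Return the largest BFS hop count from source to any reachable node.

    Assumption: this is the "radius" of a node in the sense of the abstract.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def radius_plot(adj):
    """Map each radius value to the number of nodes that have that radius."""
    return Counter(node_radius(adj, u) for u in adj)


# Hypothetical toy graph: an undirected path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

plot = radius_plot(path)   # radius value -> number of nodes
diameter = max(plot)       # largest radius over all nodes
```

One breadth-first search per node costs time proportional to the number of nodes times the number of edges in total, which is feasible only for small graphs. This is the cost the abstract calls prohibitively expensive at scale, and it is the reason an approximate, distributed method is proposed there.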
Keywords
hadoop diameter, excellent scale-up, radius plot, small web, mining radii, large graphs, mapreduce system, available machine, central node, hadi, radii estimator, hadoop, bimodal shape, graph mining, large network, small effective diameter, web crawling