Metropolis Algorithms for Representative Subgraph Sampling

ICDM (2008)

Abstract
While data mining in chemoinformatics studied graph data with dozens of nodes, systems biology and the Internet are now generating graph data with thousands and millions of nodes. Hence data mining faces the algorithmic challenge of coping with this significant increase in graph size: Classic algorithms for data analysis are often too expensive and too slow on large graphs. While one strategy to overcome this problem is to design novel efficient algorithms, the other is to 'reduce' the size of the large graph by sampling. This is the scope of this paper: We will present novel Metropolis algorithms for sampling a 'representative' small subgraph from the original large graph, with 'representative' describing the requirement that the sample shall preserve crucial graph properties of the original graph. In our experiments, we improve over the pioneering work of Leskovec and Faloutsos (KDD 2006), by producing representative subgraph samples that are both smaller and of higher quality than those produced by other methods from the literature.
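The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of the general idea of a Metropolis search over node subsets, not the authors' algorithm. It assumes an undirected graph stored as a dict of neighbour sets, a fixed sample size `k` smaller than the number of nodes, and a single hypothetical quality measure: the absolute gap between the average clustering coefficient of the induced subgraph and that of the full graph. The names `avg_clustering`, `metropolis_sample`, `steps`, and `temperature` are illustrative choices and do not come from the paper.

```python
import math
import random


def avg_clustering(adj, nodes):
    """Average local clustering coefficient of the subgraph induced by `nodes`."""
    total = 0.0
    for u in nodes:
        nbrs = [v for v in adj[u] if v in nodes]
        d = len(nbrs)
        if d < 2:
            continue  # fewer than two neighbours: contributes 0
        links = sum(1 for i in range(d) for j in range(i + 1, d)
                    if nbrs[j] in adj[nbrs[i]])
        total += 2.0 * links / (d * (d - 1))
    return total / len(nodes)


def metropolis_sample(adj, k, steps=2000, temperature=0.05, seed=0):
    """Search for a k-node subset whose clustering matches the full graph.

    Assumption: the cost is |C(sample) - C(graph)|; a real application
    would plug in whichever graph properties it needs to preserve.
    """
    rng = random.Random(seed)
    all_nodes = list(adj)
    target = avg_clustering(adj, set(all_nodes))
    current = set(rng.sample(all_nodes, k))
    cost = abs(avg_clustering(adj, current) - target)
    best, best_cost = set(current), cost
    for _ in range(steps):
        # Symmetric proposal: swap one sampled node for one unsampled node.
        out_node = rng.choice(sorted(current))
        in_node = rng.choice([v for v in all_nodes if v not in current])
        proposal = (current - {out_node}) | {in_node}
        new_cost = abs(avg_clustering(adj, proposal) - target)
        # Metropolis rule: always accept an improvement; accept a worse
        # state with probability exp(-(increase in cost) / temperature).
        if new_cost <= cost or rng.random() < math.exp(-(new_cost - cost) / temperature):
            current, cost = proposal, new_cost
            if cost < best_cost:
                best, best_cost = set(current), cost
    return best, best_cost
```

Calling `metropolis_sample(adj, k)` returns the best subset visited and its cost. Lowering `temperature` makes the search greedier, and decreasing it over the course of the run would turn the sketch into a form of simulated annealing.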
Keywords
data analysis, graph size, data mining, original graph, representative subgraph sampling, original large graph, crucial graph property, metropolis algorithms, novel metropolis algorithm, graph data, representative subgraph sample, large graph, systems biology, algorithm design and analysis, internet, simulated annealing, convergence, markov chain monte carlo, graph theory, metropolis algorithm