K-Means Clustering Over a Large, Dynamic Network

SIAM Proceedings Series (2006)

Abstract
This paper presents an algorithm for K-means clustering of data distributed over a large, dynamic network. The network is not assumed to contain any special server nodes (that is, it is a peer-to-peer network), nor is it assumed to be stable with respect to either its topology or the data held by its nodes. The algorithm requires only local communication and synchronization at each iteration: nodes communicate and synchronize only with their topological neighbors. Given the growing prevalence of peer-to-peer and mobile/wireless sensor networks, data analysis in large, dynamic networks is likely to become increasingly important in the near future. To our knowledge, this is the first algorithm for K-means, a common data analysis/mining technique, to be developed for a large, dynamic network. We tested the algorithm on synthetic data in a simulated environment of up to 1000 nodes, examining its behavior in both a static environment (no data or network change) and a dynamic environment. Empirical results show that the algorithm achieves good accuracy in both environments, in that the cluster labels it produces are very similar to those produced by K-means run on centralized data.
Keywords
Peer-to-peer (P2P), K-means clustering, dynamic networks
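
The abstract describes K-means in which each node talks only to its topological neighbors. The abstract does not give the algorithm's details, so the sketch below is not the authors' method. It is a simple illustration of the general idea, swapping in plain neighbor averaging (gossip) on a static ring network. All function names, the ring topology, the number of gossip rounds, and the synthetic data are assumptions made for this example.

```python
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_stats(points, centroids):
    """Flattened per-cluster coordinate sums, followed by per-cluster counts."""
    k, dim = len(centroids), len(centroids[0])
    stats = [0.0] * (k * dim + k)
    for p in points:
        j = nearest(p, centroids)
        for d in range(dim):
            stats[j * dim + d] += p[d]
        stats[k * dim + j] += 1.0
    return stats


def gossip_average(values, neighbors, rounds):
    """Each round, every node averages its vector with its neighbors' vectors."""
    for _ in range(rounds):
        values = [
            [
                sum(values[m][i] for m in [n] + neighbors[n]) / (1 + len(neighbors[n]))
                for i in range(len(values[n]))
            ]
            for n in range(len(values))
        ]
    return values


def decentralized_kmeans(node_data, neighbors, init_centroids,
                         iterations=10, gossip_rounds=50):
    """Run K-means where nodes exchange statistics only with their neighbors."""
    k, dim = len(init_centroids), len(init_centroids[0])
    # Assumption: every node starts from the same initial centroids.
    centroids = [[list(c) for c in init_centroids] for _ in node_data]
    for _ in range(iterations):
        # Local step: each node summarizes its own data under its own centroids.
        stats = [local_stats(pts, centroids[n]) for n, pts in enumerate(node_data)]
        # Communication step: neighbor-only averaging of the summaries.
        stats = gossip_average(stats, neighbors, gossip_rounds)
        # Update step: averaged sum / averaged count approximates the global mean.
        for n, vec in enumerate(stats):
            for j in range(k):
                count = vec[k * dim + j]
                if count > 1e-12:  # keep the old centroid if the cluster is empty
                    centroids[n][j] = [vec[j * dim + d] / count for d in range(dim)]
    return centroids


def make_ring_demo(num_nodes=8, points_per_cluster=10, seed=0):
    """Hypothetical test bed: 2-D points around (0, 0) and (10, 10) on a ring."""
    rng = random.Random(seed)
    node_data = [
        [[rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)]
         for mu in (0.0, 10.0) for _ in range(points_per_cluster)]
        for _ in range(num_nodes)
    ]
    neighbors = [[(n - 1) % num_nodes, (n + 1) % num_nodes] for n in range(num_nodes)]
    return node_data, neighbors


if __name__ == "__main__":
    data, ring = make_ring_demo()
    for node_id, cents in enumerate(decentralized_kmeans(data, ring, [[1.0, 1.0], [9.0, 9.0]])):
        print(node_id, cents)
```

Equal-weight averaging converges to the true network-wide mean only when every node has the same number of neighbors, as on the ring used here; irregular topologies would need different weights. The sketch also covers only the static setting, with no handling of nodes joining, leaving, or changing their data.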