A Novel Split-Merge-Evolve k Clustering Algorithm

2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService) (2018)

Abstract
Clustering algorithms are used in many big data analytics applications across various domains, including network management. We propose a novel Split-merge-evolve algorithm for clustering data into k clusters. The algorithm initially divides the data randomly into k clusters, then repeatedly splits bad clusters and merges the closest clusters to evolve the final clustering result. The process is driven by a clustering quality metric (an internal evaluation measure) that the user chooses or defines. The algorithm evolves the clustering result towards the high-quality result the user expects, even though no ground truth or labelled data is involved in the clustering process. The design is flexible: common techniques, such as centroid-based and connectivity-based measures, can be used in its implementation. The algorithm is easy to implement and effective. In our experiments on 4 datasets, 2 of them real-life datasets, the Split-merge-evolve algorithm performs better than both of the most commonly used algorithms, K-means and Agglomerative hierarchical clustering.
Keywords
Clustering, K-means, Agglomerative hierarchical, data mining, machine learning
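
The split-merge-evolve loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the split rule, the merge rule, the quality metric, or the stopping criterion, so everything below is an assumption made for illustration: a "bad" cluster is taken to be the one with the largest within-cluster sum of squared errors, it is split around two far-apart points, the "closest" clusters are the pair with the nearest centroids, and the default quality metric is the negative total sum of squared errors. All function names (`split_merge_evolve`, `split`, `neg_total_sse`, and so on) are hypothetical and are not taken from the paper.

```python
import random


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def sse(points):
    """Within-cluster sum of squared errors (assumed 'badness' measure)."""
    c = centroid(points)
    return sum(sqdist(p, c) for p in points)


def neg_total_sse(clusters):
    """Assumed default quality metric: higher is better."""
    return -sum(sse(c) for c in clusters)


def split(points):
    """Assumed split rule: partition around two far-apart anchor points."""
    c = centroid(points)
    a = max(points, key=lambda p: sqdist(p, c))
    b = max(points, key=lambda p: sqdist(p, a))
    left = [p for p in points if sqdist(p, a) <= sqdist(p, b)]
    right = [p for p in points if sqdist(p, a) > sqdist(p, b)]
    return left, right


def split_merge_evolve(data, k, quality=neg_total_sse, max_iter=100, seed=0):
    """Sketch of the loop; requires len(data) >= k."""
    rng = random.Random(seed)
    points = [tuple(p) for p in data]
    rng.shuffle(points)
    # Random initial division into k non-empty clusters.
    clusters = [points[i::k] for i in range(k)]
    best = quality(clusters)
    for _ in range(max_iter):
        # Split step: pick the worst cluster and cut it in two (k + 1 clusters).
        worst = max(range(k), key=lambda i: sse(clusters[i]))
        if sse(clusters[worst]) == 0:
            break  # nothing left to split
        left, right = split(clusters[worst])
        cand = [c for i, c in enumerate(clusters) if i != worst] + [left, right]
        # Merge step: join the two clusters with the nearest centroids (back to k).
        pairs = [(i, j) for i in range(len(cand)) for j in range(i + 1, len(cand))]
        i, j = min(pairs, key=lambda ij: sqdist(centroid(cand[ij[0]]),
                                                centroid(cand[ij[1]])))
        merged = cand[i] + cand[j]
        cand = [c for n, c in enumerate(cand) if n not in (i, j)] + [merged]
        # Evolve step: keep the candidate only if the quality metric improves.
        score = quality(cand)
        if score <= best:
            break  # assumed stopping rule: stop at the first non-improving round
        clusters, best = cand, score
    return clusters
```

Calling `split_merge_evolve(points, k=2)` on a list of coordinate tuples returns a list of k clusters, each a list of tuples. Passing a different callable as `quality` swaps in another internal evaluation measure, which mirrors the user-chosen metric the abstract describes; a connectivity-based merge rule could likewise replace the centroid distance used here.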