Improving The Efficiency And Efficacy Of The K-Means Clustering Algorithm Through A New Convergence Condition

ICCSA'07: Proceedings of the 2007 International Conference on Computational Science and Its Applications - Volume Part III (2007)

Abstract
Clustering problems arise in many different applications: machine learning, data mining, knowledge discovery, data compression, vector quantization, pattern recognition and pattern classification. One of the most popular and widely studied clustering methods is K-means. Several improvements to the standard K-means algorithm have been proposed, most of them related to the choice of initial parameter values. In contrast, this article proposes an improvement based on a new convergence condition: execution stops when a local optimum is found or when no more objects can be exchanged among groups. To assess the improvement, the modified algorithm (Early Stop K-means) was tested on six databases from the UCI repository, and the results were compared against SPSS, Weka and the standard K-means algorithm. In these experiments, Early Stop K-means achieved substantial reductions in the number of iterations and improvements in solution quality with respect to the other algorithms.
Keywords
Convergence Condition, Early Stop, Initial Centroid, Diabetes Database, Heart Disease Dataset
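
The abstract describes the convergence condition only in words, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that "a local optimum is found" means the total squared error stops decreasing from one iteration to the next, and that "no more object exchanges" means the cluster assignments no longer change. The function names (`early_stop_kmeans`, `assign`, `update`, `squared_error`), the random initialization and the `max_iter` cap are all hypothetical choices made for illustration.

```python
import random


def squared_error(points, centroids, labels):
    """Total squared Euclidean distance from each point to its centroid."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centroids[label]))
        for point, label in zip(points, labels)
    )


def assign(points, centroids):
    """Assign every point to the index of its nearest centroid."""
    labels = []
    for point in points:
        dists = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in centroids
        ]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # assumption: keep an empty cluster's centroid
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def early_stop_kmeans(points, k, max_iter=100, seed=0):
    """K-means with an assumed early-stop rule (see the lead-in for the assumptions)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = assign(points, centroids)
    best_error = squared_error(points, centroids, labels)
    best = (centroids, labels)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        error = squared_error(points, centroids, new_labels)
        if error >= best_error:
            break  # assumed local-optimum test: the error did not decrease
        best_error, best = error, (centroids, new_labels)
        if new_labels == labels:
            break  # no object changed group
        labels = new_labels
    return best[0], best[1], best_error, iterations


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, error, iterations = early_stop_kmeans(points, k=2)
print(labels, round(error, 3), iterations)
```

Checking the error before the assignments means the loop returns the best solution seen so far rather than the last one computed. Whether this matches the paper's exact rule cannot be confirmed from the abstract alone.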