
Coordinate Descent Method for $k$-means

IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)

Abstract
The $k$-means method with the Lloyd heuristic is a traditional clustering method that has played a key role in many downstream machine learning tasks because of its simplicity. However, the Lloyd heuristic often converges to a poor local minimum, i.e., one whose objective function value is not small enough, which limits the performance of $k$-means. In this paper, we use a coordinate descent (CD) method to address this problem. First, we show that the $k$-means minimization problem can be reformulated as a trace maximization problem; then a simple and efficient coordinate descent scheme is proposed to solve the maximization problem. Two interesting theoretical findings are that Lloyd cannot further decrease the $k$-means objective value produced by our CD method, and that solving the $k$-means problem with CD avoids producing empty clusters. In addition, computational complexity analysis verifies that CD has the same time complexity as the original $k$-means method. Extensive experiments, including statistical hypothesis testing, on several real-world datasets with varying numbers of clusters, samples, and dimensions show that CD outperforms Lloyd, i.e., it attains a lower objective value, a better local minimum, and fewer iterations. Moreover, CD is more robust to initialization than Lloyd, whether the initialization strategy is random or $k$-means++.
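The coordinate descent idea is to update one sample's cluster assignment at a time while all other assignments stay fixed. Below is a minimal Python sketch of such an assignment-wise update that maintains per-cluster sums and sizes incrementally; the function name `cd_kmeans`, the Hartigan-style per-point move cost used here, and all parameter choices are illustrative assumptions rather than the paper's exact trace-maximization formulation.

```python
import numpy as np

def cd_kmeans(X, k, n_iters=100, seed=0):
    # Sketch of a coordinate-descent-style k-means update: sweep over samples
    # and move each one to the cluster that lowers the sum-of-squares objective,
    # keeping cluster sums and sizes up to date. Illustrative reconstruction only.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, k, size=n)      # random initial assignment
    sums = np.zeros((k, d))                  # per-cluster coordinate sums
    counts = np.zeros(k, dtype=int)          # per-cluster sizes
    for i in range(n):
        sums[labels[i]] += X[i]
        counts[labels[i]] += 1

    for _ in range(n_iters):
        changed = False
        for i in range(n):
            c = labels[i]
            if counts[c] == 1:
                continue                     # never empty a cluster
            x = X[i]
            # Objective decrease if x leaves its current cluster c
            # (the cluster mean includes x, hence the n_c / (n_c - 1) factor).
            gain = counts[c] / (counts[c] - 1) * np.sum((x - sums[c] / counts[c]) ** 2)
            # Objective increase if x joins each candidate cluster j != c.
            means = sums / np.maximum(counts, 1)[:, None]
            costs = counts / (counts + 1) * np.sum((x - means) ** 2, axis=1)
            costs[c] = gain                  # staying put is a net change of zero
            j = int(np.argmin(costs))
            if costs[j] < gain:              # strict improvement: move x to cluster j
                sums[c] -= x; counts[c] -= 1
                sums[j] += x; counts[j] += 1
                labels[i] = j
                changed = True
        if not changed:
            break                            # no sample moved: a local optimum
    return labels

# Example usage on random data (illustrative only).
if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(500, 2))
    print(np.bincount(cd_kmeans(X, k=5), minlength=5))
```

Note that an empty cluster has zero insertion cost in this sketch, so the next sweep immediately refills it, consistent with the abstract's remark that the CD scheme avoids producing empty clusters.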