Efficient High-Dimensional Kernel k-Means++ with Random Projection

Applied Sciences-Basel (2021)

Abstract
Using random projection, we propose a method to speed up both kernel k-means and centroid initialization with k-means++. Motivated by upper error bounds, we approximate the kernel matrix and distances in a lower-dimensional space $\mathbb{R}^d$ before kernel k-means clustering. For kernel k-means under random projection, we consider previous work on bounds for dot products together with an improved bound for kernel methods. The complexities of kernel k-means with Lloyd's algorithm and of centroid initialization with k-means++ are known to be $O(nkD)$ and $\Theta(nkD)$, respectively, where $n$ is the number of data points, $D$ is the dimensionality of the input feature vectors, and $k$ is the number of clusters. The proposed method reduces the computational complexity of the kernel computation in kernel k-means from $O(n^2 D)$ to $O(n^2 d)$, and that of the subsequent k-means computation with Lloyd's algorithm and of centroid initialization from $O(nkD)$ to $O(nkd)$. Our experiments demonstrate that, with reduced dimensionality $d = 200$, the clustering method is 2 to 26 times faster, generally with very little performance degradation (less than one percent).
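The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Gaussian projection matrix, the RBF kernel, the choice of `gamma`, the seeding in the projected input space, and the function names `random_project`, `rbf_kernel`, and `kmeanspp_init` are all assumptions made for illustration. It shows only where the dimension $D$ is replaced by $d$ in the kernel and seeding computations.

```python
import numpy as np

def random_project(X, d, rng):
    """Map rows of X from R^D to R^d with a dense Gaussian matrix.

    Entries are N(0, 1/d), so squared norms are preserved in expectation.
    (Assumed projection type; the projection itself costs O(n D d).)
    """
    D = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(d), size=(D, d))
    return X @ R

def rbf_kernel(Z, gamma):
    """RBF kernel matrix on the rows of Z; O(n^2 d) when Z has d columns."""
    sq = np.sum(Z * Z, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    np.maximum(dist2, 0.0, out=dist2)  # clamp tiny negative round-off
    return np.exp(-gamma * dist2)

def kmeanspp_init(Z, k, rng):
    """k-means++ seeding on the rows of Z; O(n k d). Returns k row indices."""
    n = Z.shape[0]
    first = int(rng.integers(n))
    centers = [first]
    # Squared distance from every point to its nearest chosen center.
    d2 = np.sum((Z - Z[first]) ** 2, axis=1)
    for _ in range(1, k):
        idx = int(rng.choice(n, p=d2 / d2.sum()))
        centers.append(idx)
        d2 = np.minimum(d2, np.sum((Z - Z[idx]) ** 2, axis=1))
    return np.array(centers)

# Hypothetical usage on synthetic data (sizes chosen for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2000))       # n = 300 points in R^D, D = 2000
Z = random_project(X, d=200, rng=rng)  # same points in R^d, d = 200
K = rbf_kernel(Z, gamma=1.0 / X.shape[1])
seeds = kmeanspp_init(Z, k=5, rng=rng)
```

The kernel matrix `K` and the seed indices `seeds` would then be passed to a kernel k-means routine; that step is omitted here because the abstract does not specify it in detail.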
Keywords
kernel k-means, k-means++, random projection, dimensionality reduction, high-dimensional data