Sampling Based Approximate Spectral Clustering Ensemble For Partitioning Datasets

2016 23rd International Conference on Pattern Recognition (ICPR), 2016

Abstract
Spectral clustering can extract clusters with various characteristics without a parametric model; however, it is infeasible for large datasets because of its high computational cost and memory requirements. Approximate spectral clustering (ASC) addresses this challenge with a representative-based partitioning approach, which first finds a set of data representatives by either sampling or quantization, then applies spectral clustering to them. To achieve an optimal partitioning with ASC, several sampling and quantization methods, together with advanced similarity criteria, have recently been proposed. While quantization is more accurate than sampling at the expense of heavier computation, and geodesic-based hybrid similarity criteria are often more informative than others, no single combination is optimal for all datasets. Instead, we propose to use ensemble learning to produce a consensus partitioning constructed from different sets of representatives and different similarity criteria. The proposed ensemble (SASCE) not only produces a relatively more accurate partitioning but also eliminates the need to determine the best pair (the optimum set of representatives and the optimum similarity criterion). Thanks to the efficient similarity definition at the representative level, SASCE can be powerful for clustering small and medium datasets, outperforming traditional clustering approaches and their ensembles.
Keywords
approximate spectral clustering,cluster ensemble,maximum voting,geodesic distances,hybrid similarity
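
The abstract does not spell out how the consensus partitioning is built, but the keywords mention maximum voting. The following is a minimal sketch of a voting-based consensus, under the assumption that each ensemble member has already assigned every data point a cluster label in 0..k-1. The function names (`align_labels`, `majority_vote`) and the brute-force label alignment are hypothetical choices for illustration, not the paper's method.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, k):
    """Relabel `labels` with the permutation of 0..k-1 that agrees most
    often with `reference`. Brute force over k! permutations, so this is
    only sensible for small k (an assumption made for brevity)."""
    best_perm, best_agree = None, -1
    for perm in permutations(range(k)):
        agree = sum(1 for r, l in zip(reference, labels) if r == perm[l])
        if agree > best_agree:
            best_perm, best_agree = perm, agree
    return [best_perm[l] for l in labels]


def majority_vote(partitions, k):
    """Align every partition to the first one, then give each point the
    label that most ensemble members voted for."""
    reference = partitions[0]
    aligned = [list(reference)]
    aligned += [align_labels(reference, p, k) for p in partitions[1:]]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


# Three toy partitions of six points into k=2 clusters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, labels swapped
    [0, 0, 1, 1, 1, 1],  # disagrees with the others on the third point
]
consensus = majority_vote(partitions, k=2)
print(consensus)
```

Aligning labels before voting matters because cluster indices are arbitrary: two members can agree on the grouping while numbering the clusters differently. In a setting like the one the abstract describes, each input partition would come from a different pairing of representatives and similarity criterion.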