A hierarchical clustering algorithm based on noise removal

International Journal of Machine Learning and Cybernetics (2018)

Abstract
Noise is irrelevant or meaningless data that hinders most types of data analysis. Existing clustering algorithms seldom take noise points into consideration and have difficulty detecting arbitrary-shaped clusters. This paper presents a Hierarchical Clustering algorithm Based on Noise Removal (HCBNR), which is robust against noise points and good at discovering clusters with arbitrary shapes. First, a natural neighbor-based density is used to remove noise points from the data set. Next, a saturated neighbor graph is constructed on the remaining points, and a novel modularity-based graph partitioning algorithm divides the graph into small clusters. Finally, the small clusters are repeatedly merged according to a novel similarity metric between clusters until the desired number of clusters is obtained. Experimental results on synthetic and real data sets show that the method accurately identifies noise points and, when discovering arbitrary-shaped clusters, obtains better clustering results than existing clustering algorithms.
Keywords
Hierarchical clustering, Natural neighbor, Noise removal
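
The noise-removal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the density or the noise criterion, so the density used here (k divided by the summed distance to the k nearest neighbors, with k taken from a natural neighbor search) and the `ratio` threshold are assumptions made for illustration, and the function names `knn_tables`, `natural_k`, and `noise_mask` are hypothetical. The sketch also assumes all points are distinct. The graph partitioning and merging stages are not sketched because the abstract does not specify them.

```python
import numpy as np


def knn_tables(X):
    """Return neighbor indices and distances, nearest first, excluding self."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, -1.0)  # force each point to sort first in its own row
    order = np.argsort(dist, axis=1, kind="stable")
    sorted_dist = np.take_along_axis(dist, order, axis=1)
    return order[:, 1:], sorted_dist[:, 1:]  # drop the self column


def natural_k(order):
    """Natural neighbor search: grow the neighborhood size until stable.

    In round r every point names its r-th nearest neighbor, which gains one
    reverse neighbor. The search stops when every point has a reverse
    neighbor, or when the number of points without one stops changing.
    """
    n = order.shape[0]
    reverse_counts = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(order.shape[1]):
        np.add.at(reverse_counts, order[:, r], 1)
        orphans = int(np.sum(reverse_counts == 0))
        if orphans == 0 or orphans == prev_orphans:
            return r + 1
        prev_orphans = orphans
    return order.shape[1]


def noise_mask(X, ratio=0.5):
    """Flag points whose density is below `ratio` times the median density.

    Assumed density: k / (sum of distances to the k nearest neighbors).
    Both the formula and `ratio` are illustrative choices, not the paper's.
    """
    order, sorted_dist = knn_tables(X)
    k = natural_k(order)
    density = k / sorted_dist[:, :k].sum(axis=1)
    return density < ratio * np.median(density)


# Two 3x3 grids of points plus one distant point.
grid_a = [(x, y) for x in (0, 1, 2) for y in (0, 1, 2)]
grid_b = [(x, y) for x in (10, 11, 12) for y in (0, 1, 2)]
X = np.array(grid_a + grid_b + [(6, 50)], dtype=float)
mask = noise_mask(X)
X_clean = X[~mask]  # points that would be passed on to the graph stage
```

Deriving k from the data rather than fixing it by hand is the main appeal of a natural neighbor search in this setting: the only tunable value left in the sketch is the assumed `ratio` threshold.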