NNB: An efficient nearest neighbor search method for hierarchical clustering on large datasets

IEEE International Conference on Semantic Computing (2015)

Abstract
Nearest neighbor search is a key technique in hierarchical clustering. The time complexity of standard agglomerative hierarchical clustering is O(n^3), while that of more advanced hierarchical clustering algorithms (such as nearest neighbor chain) is O(n^2). This paper presents a new nearest neighbor search method called nearest neighbor boundary (NNB), which first divides a large dataset into independent subsets and then finds the nearest neighbor of each point in the subsets. When NNB is used, the time complexity of hierarchical clustering can be reduced to O(n log2n). Based on NNB, we propose a fast hierarchical clustering algorithm called nearest-neighbor boundary clustering (NBC), which can also be adapted to parallel and distributed computing frameworks. The experimental results demonstrate that the proposed algorithm is practical for large datasets.
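The abstract gives no algorithmic detail for NNB, so the following is only a minimal sketch of the general "partition, then search within subsets" idea, not the paper's method. It assumes a single split along one coordinate axis and a brute-force search inside each subset; the function name `nearest_neighbors_partitioned` and the boundary check are illustrative assumptions. A point is compared against the other subset only when it lies closer to the splitting plane than to its current nearest neighbor, which keeps the result exact.

```python
import math


def nearest_neighbors_partitioned(points, axis=0):
    """Exact nearest neighbor of every point using one split along `axis`.

    Illustrative sketch only (NOT the NNB algorithm from the paper).
    Returns a list nn where nn[i] is the index of the point closest to points[i].
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Divide the dataset into two subsets at the median along `axis`.
    order = sorted(range(n), key=lambda i: points[i][axis])
    mid = n // 2
    left, right = order[:mid], order[mid:]
    # Splitting plane halfway between the two subsets.
    split = (points[left[-1]][axis] + points[right[0]][axis]) / 2.0

    best = [(math.inf, -1)] * n  # (distance, neighbor index) per point

    def search(i, candidates):
        for j in candidates:
            if j != i:
                d = math.dist(points[i], points[j])
                if d < best[i][0]:
                    best[i] = (d, j)

    # Step 1: search inside each subset independently.
    for subset in (left, right):
        for i in subset:
            search(i, subset)

    # Step 2: a point can have a closer neighbor in the other subset only
    # if its distance to the splitting plane is below its current best.
    for subset, other in ((left, right), (right, left)):
        for i in subset:
            if abs(points[i][axis] - split) < best[i][0]:
                search(i, other)

    return [j for _, j in best]


pts = [(0, 0), (1, 0), (5, 0), (6, 0), (2.9, 0)]
print(nearest_neighbors_partitioned(pts))  # [1, 0, 3, 2, 1]
```

With a single split the worst case is still quadratic; the sketch only illustrates why independent subsets plus a boundary check can preserve exact nearest neighbors. The two subset searches in step 1 do not depend on each other, which is the property that makes this style of search a natural fit for parallel execution.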
Keywords
Hierarchical clustering, nearest neighbor boundary, parallel and distributed computing, MapReduce