DACA: Distributed adaptive grid decision graph based clustering algorithm

SOFTWARE-PRACTICE & EXPERIENCE(2022)

引用 1|浏览12
暂无评分
摘要
Clustering algorithms play a very important role in machine learning. With the development of big-data artificial intelligence, distributed parallel algorithms have become an important research field. To reduce the computational complexity and running time of large-scale datasets in the clustering process, this study proposes a distributed clustering algorithm DACA (distributed adaptive grid decision graph based clustering algorithm). In a distributed environment, DACA uses relative entropy to adaptively mesh the data to form an obvious sparse grid and dense grid. Then, the decision graph is used to determine the cluster center mesh object. Finally, the KD-tree is used to accelerate the determination of the cluster center of sparse points to complete clustering. The algorithm is implemented using the popular Apache Spark computing framework, compared with other distributed clustering algorithms, DACA can adaptively divide the grid according to the data distribution to obtain better clustering effect. At the same time, KD tree algorithm is used to speed up the decision-making of clustering center. Numerous experiments show that the DACA algorithm has excellent performance and accuracy on six standard datasets and real GPS trajectory datasets.
更多
查看译文
关键词
adaptive grid division,clustering algorithms,decision graphs,distributed,KD-tree
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要