A simple distributed fuzzy c-means clustering method via the technique of Map-Reduce

Yashuang Mu, Tian Liu, Jiayan Li, Kai Hou, Lei Zhang, Jiangyong Wang

2023 35th Chinese Control and Decision Conference (CCDC 2023)

Abstract
Clustering is one of the fundamental data analysis tasks in machine learning and data mining. However, many traditional clustering methods cannot directly handle large-scale data sets because of memory restrictions. In this paper, we develop a simple distributed fuzzy c-means clustering method based on Map-Reduce for large-scale data sets. First, a distributed data dividing method is designed to randomly partition the original data set into several data blocks. Then, we establish a center-determining method by applying the fuzzy c-means clustering algorithm within the Map-Reduce framework. Finally, a distributed clustering method is proposed according to the distance between each sample and its nearest center. The experimental studies verify the feasibility of the method in terms of clustering accuracy and its parallelism in terms of clustering time.
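The three steps named in the abstract (random partitioning, center determination with fuzzy c-means, nearest-center assignment) can be illustrated with a minimal single-machine sketch. The abstract does not give the algorithmic details, so several choices here are assumptions rather than the authors' method: the function names are hypothetical, the fuzzifier is fixed at m = 2, initial centers are picked by a farthest-point rule, and the reduce step simply pools the per-block centers and runs fuzzy c-means on them once more. Plain Python lists stand in for a real Map-Reduce framework.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_blocks(data, n_blocks, seed=0):
    """Step 1: randomly partition the data set into n_blocks blocks."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_blocks] for i in range(n_blocks)]


def init_centers(points, c):
    """Farthest-point initialization (an assumption, not from the paper)."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, v) for v in centers))
        )
    return centers


def fcm(points, c, m=2.0, iters=50):
    """Standard fuzzy c-means on one block; returns c centers."""
    centers = init_centers(points, c)
    dim = len(points[0])
    for _ in range(iters):
        # Membership u_ik is proportional to d_ik^(-2/(m-1)); store u_ik^m.
        weights = []
        for p in points:
            inv = [(sq_dist(p, v) + 1e-12) ** (-1.0 / (m - 1)) for v in centers]
            total = sum(inv)
            weights.append([(w / total) ** m for w in inv])
        # Each center becomes the u^m-weighted mean of all points.
        centers = []
        for k in range(c):
            norm = sum(w[k] for w in weights)
            centers.append(tuple(
                sum(w[k] * p[d] for w, p in zip(weights, points)) / norm
                for d in range(dim)
            ))
    return centers


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))


def distributed_fcm(data, c, n_blocks=4, seed=0):
    blocks = split_blocks(data, n_blocks, seed)
    # Map: run fuzzy c-means independently on every block.
    local_centers = [fcm(block, c) for block in blocks]
    # Reduce (assumed scheme): pool the block centers and cluster them again.
    pooled = [v for block_centers in local_centers for v in block_centers]
    centers = fcm(pooled, c)
    # Step 3: label each sample by its nearest center. In a real job this
    # would also run per block; it is done in one pass here for brevity.
    labels = [nearest(p, centers) for p in data]
    return centers, labels
```

Calling `distributed_fcm(points, c=2, n_blocks=4)` on a list of coordinate tuples returns the final centers and one label per input sample. Because each block is processed independently in the map step, only the small set of block centers has to be held together at the reduce step, which is the property that makes this kind of scheme attractive when the full data set does not fit in memory.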
Keywords
Clustering analysis, Distributed computing, Map-Reduce, Fuzzy c-means