C4y: a metric for distributed IoT clustering

Yewang Chen, Yuanyuan Yang, Yi Chen

CCF Transactions on Pervasive Computing and Interaction (2024)

Abstract
In the era of the Internet of Things (IoT), the proliferation of interconnected devices and sensors has produced an unprecedented volume of data. Effective data analysis, particularly clustering, has become pivotal in handling the challenges this volume poses. Clustering evaluation plays a critical role in determining the quality of clustering results. However, traditional cluster validity metrics are ill-suited to the distributed nature of IoT data. To address this gap, we introduce a novel distributed clustering evaluation metric named C4Y. It is rooted in sampling theory and is designed to evaluate the performance of clustering algorithms in distributed IoT environments. It operates on two key principles: (1) The dataset held by each distributed IoT node is treated as a sample of the entire dataset, and each sample is expected to exhibit a data distribution, including category distribution, similar to that of the overall dataset. (2) The centers of each category across all samples are assumed to conform to a Gaussian distribution. The metric quantifies the extent to which category centers in different samples adhere to Gaussian distributions and measures the dissimilarity between these categories. Empirical results on various public datasets, spanning diverse sizes and dimensions, demonstrate that C4Y effectively assesses the performance of distributed clustering algorithms. This approach promises to advance data analytics for distributed IoT data and to support the development of sophisticated IoT systems.
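The two principles in the abstract can be illustrated with a small sketch. This is not the C4Y formula, which the abstract does not give; it is a simplified stand-in, assuming NumPy, that category labels are aligned across nodes, and that every node holds points from every category. The helper names `node_centers` and `center_agreement` are hypothetical. The sketch measures how tightly the per-node centers of each category cluster around their mean (the quantity a Gaussian fit would describe) and how far apart the categories are; the Gaussian goodness-of-fit step itself is omitted.

```python
import numpy as np

def node_centers(X, labels, k):
    """Mean of each category 0..k-1 on one node; result has shape (k, d)."""
    return np.stack([X[labels == c].mean(axis=0) for c in range(k)])

def center_agreement(centers):
    """centers has shape (n_nodes, k, d): per-node category centers.

    Returns (spread, separation):
      spread     -- RMS distance of node centers to their category's mean center
      separation -- smallest distance between the mean centers of two categories
    """
    grand = centers.mean(axis=0)                      # (k, d) mean center per category
    spread = np.sqrt(((centers - grand) ** 2).sum(axis=2).mean())
    dist = np.linalg.norm(grand[:, None, :] - grand[None, :, :], axis=2)
    k = grand.shape[0]
    separation = dist[~np.eye(k, dtype=bool)].min()   # ignore the zero diagonal
    return spread, separation

# Synthetic data (made up for illustration): 5 nodes, 2 well-separated categories.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0]])
good, bad = [], []
for _ in range(5):
    labels = rng.integers(0, 2, size=200)
    X = true_centers[labels] + rng.normal(size=(200, 2))
    good.append(node_centers(X, labels, k=2))         # correct clustering
    shuffled = rng.permutation(labels)                # labels unrelated to the data
    bad.append(node_centers(X, shuffled, k=2))

centers = np.stack(good)
spread, separation = center_agreement(centers)
good_ratio = separation / spread
bad_spread, bad_separation = center_agreement(np.stack(bad))
bad_ratio = bad_separation / bad_spread
print(good_ratio > bad_ratio)
```

Under these assumptions, a clustering whose per-node centers agree and whose categories are well separated yields a larger separation-to-spread ratio than a labeling unrelated to the data, which is the kind of behavior the abstract attributes to the metric.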
Keywords
Clustering, Clustering validity metric, Distributed context, Sampling theory