A parameter-free clustering algorithm for missing datasets
arXiv (2024)
Abstract
Missing datasets, in which some objects have missing values in certain
dimensions, are prevalent in the real world. Existing clustering algorithms for
missing datasets first impute the missing values and then perform clustering.
However, both the imputation and clustering processes require input parameters.
Too many input parameters inevitably increase the difficulty of obtaining
accurate clustering results. Although some studies have shown that decision
graphs can replace the input parameters of clustering algorithms, current
decision graphs require equivalent dimensions among objects and are therefore
not suitable for missing datasets. To this end, we propose a Single-Dimensional
Clustering algorithm (SDC). SDC removes the imputation process and adapts the
decision graph to missing datasets through dimension splitting and partition
intersection fusion, so it can obtain valid clustering results on missing
datasets without input parameters. Experiments demonstrate that, across
three evaluation metrics, SDC outperforms baseline algorithms by at least
13.7
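
The abstract names two steps, splitting by dimension and fusing the per-dimension partitions by intersection, without giving their details. The Python sketch below is one possible reading of that idea and is not the paper's SDC algorithm. The largest-gap rule stands in for the decision graph. The rule for assigning objects with missing values is also an assumption. The function names `split_dimension`, `fuse_partitions`, and `cluster` are hypothetical.

```python
def split_dimension(values):
    """Partition one dimension's observed values at its largest gap.

    Returns one label per object: 0 or 1 if observed, None if missing.
    ASSUMPTION: the largest-gap rule is a stand-in, not the paper's decision graph.
    """
    observed = sorted(v for v in values if v is not None)
    if len(observed) < 2:
        return [None if v is None else 0 for v in values]
    # Pair each gap with its midpoint so the widest gap yields the threshold.
    gaps = [(b - a, (a + b) / 2) for a, b in zip(observed, observed[1:])]
    _, threshold = max(gaps)
    return [None if v is None else int(v > threshold) for v in values]


def fuse_partitions(per_dim_labels):
    """Intersect per-dimension partitions into a single clustering.

    ASSUMPTION: an object with missing labels joins a cluster only when exactly
    one fully observed label tuple agrees with it; otherwise it gets -1.
    """
    keys = list(zip(*per_dim_labels))  # one label tuple per object
    complete = sorted({k for k in keys if None not in k})
    cluster_of = {k: i for i, k in enumerate(complete)}
    labels = []
    for k in keys:
        if None not in k:
            labels.append(cluster_of[k])
            continue
        # Compare on observed dimensions only; None acts as a wildcard.
        matches = [c for c in complete
                   if all(a is None or a == b for a, b in zip(k, c))]
        labels.append(cluster_of[matches[0]] if len(matches) == 1 else -1)
    return labels


def cluster(rows):
    """Cluster rows that may contain None, with no imputation and no parameters."""
    columns = list(zip(*rows))
    return fuse_partitions([split_dimension(col) for col in columns])


# Toy data: two well-separated groups, each with one missing value.
rows = [
    (1.0, 1.0),
    (1.2, None),
    (0.9, 1.1),
    (5.0, 5.2),
    (None, 5.0),
    (5.1, 4.9),
]
labels = cluster(rows)
```

On this toy input, each incomplete object agrees with exactly one fully observed group on the dimension it does have, so no value is imputed and no threshold is supplied by the user.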