Applying Semi-Automated Hyperparameter Tuning for Clustering Algorithms

arXiv (Cornell University)(2021)

引用 0|浏览0
暂无评分
摘要
When approaching a clustering problem, choosing the right clustering algorithm and parameters is essential, as each clustering algorithm is proficient at finding clusters of a particular nature. Due to the unsupervised nature of clustering algorithms, there are no ground truth values available for empirical evaluation, which makes automation of the parameter selection process through hyperparameter tuning difficult. Previous approaches to hyperparameter tuning for clustering algorithms have relied on internal metrics, which are often biased towards certain algorithms, or having some ground truth labels available, moving the problem into the semi-supervised space. This preliminary study proposes a framework for semi-automated hyperparameter tuning of clustering problems, using a grid search to develop a series of graphs and easy to interpret metrics that can then be used for more efficient domain-specific evaluation. Preliminary results show that internal metrics are unable to capture the semantic quality of the clusters developed and approaches driven by internal metrics would come to different conclusions than those driven by manual evaluation.
更多
查看译文
关键词
clustering algorithms,semi-automated
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要