Genetic Algorithm for Finding Cluster Hierarchies.

DEXA'11: Proceedings of the 22nd International Conference on Database and Expert Systems Applications - Volume Part I (2011)

Abstract
Hierarchical clustering algorithms have been studied extensively in recent years. However, existing approaches to hierarchical clustering suffer from several drawbacks. The representation of the results is often hard to interpret, especially for large datasets. Many approaches are not robust to noise objects, or overcome this limitation only through difficult parameter settings. Because many approaches depend heavily on their initialization, the resulting hierarchical clustering can get stuck in a local optimum. In this paper, we propose the novel genetic-based hierarchical clustering algorithm GACH (Genetic Algorithm for finding Cluster Hierarchies), which addresses these problems through a beneficial combination of genetic algorithms, information theory and model-based clustering. GACH is capable of finding the correct number of model parameters using the Minimum Description Length (MDL) principle, and it does not depend on the initialization because it uses a population-based stochastic search, which ensures a thorough exploration of the search space. Moreover, outliers are handled by assigning them to appropriate inner nodes of the hierarchy or even to the root. An extensive evaluation of GACH on synthetic as well as real data demonstrates the superiority of our algorithm over several existing approaches.
Keywords
Genetic Algorithm, Single Link, Normalized Mutual Information, Crossover Rate, Minimum Description Length
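
The abstract pairs an MDL-based fitness with a population-based stochastic search. The following is a minimal illustrative sketch of that general idea only, not the GACH algorithm itself: it clusters 1-D points into a flat (non-hierarchical) partition, and the cost function, the 0.5*log2(n) bits-per-parameter convention, the variance floor, the operators and all names (`mdl_cost`, `genetic_search`, `max_k`, `mutation_rate`) are assumptions made for the example.

```python
import math
import random


def mdl_cost(points, labels, var_floor=1e-3):
    """Two-part description length (in bits) of a hard 1-D Gaussian clustering.

    Assumed convention: each real-valued parameter costs 0.5*log2(n) bits.
    Because densities are continuous, the value is only meaningful for
    comparing clusterings of the same data (it can be negative).
    """
    n = len(points)
    clusters = {}
    for x, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(x)

    total = 0.0
    for members in clusters.values():
        m = len(members)
        mean = sum(members) / m
        # Floor the variance so tiny or degenerate clusters stay finite.
        var = max(sum((x - mean) ** 2 for x in members) / m, var_floor)
        # Model cost: mean, variance and weight of this cluster.
        total += 1.5 * math.log2(n)
        # Cost of stating which cluster each member belongs to.
        total += -m * math.log2(m / n)
        # Cost of coding each member under the cluster's Gaussian.
        for x in members:
            log_pdf = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            total += -log_pdf / math.log(2)
    return total


def genetic_search(points, max_k=4, pop_size=30, generations=60,
                   mutation_rate=0.05, seed=0):
    """Toy genetic search over label vectors, using mdl_cost as fitness."""
    rng = random.Random(seed)
    n = len(points)
    population = [[rng.randrange(max_k) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        population.sort(key=lambda ind: mdl_cost(points, ind))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: take each label from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally reassign a point to a random cluster.
            child = [rng.randrange(max_k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ind: mdl_cost(points, ind))
```

On two well-separated groups of points, `mdl_cost` assigns fewer bits to the two-cluster labeling than to a single cluster, which is how an MDL criterion can select the number of clusters without a user-set parameter. Because selection is elitist, the cost of the best individual never increases from one generation to the next, although this toy search carries no guarantee of reaching the global optimum.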