Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space

KDD '19: The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, August 2019

Abstract
Hierarchical clustering is typically performed by algorithms that search over the discrete space of trees. While these methods are often effective, their discreteness denies them many benefits of their continuous counterparts, such as scalable stochastic optimization and the joint optimization of multiple objectives or components of a model (e.g., end-to-end training). In this paper, we present an approach for hierarchical clustering that searches over continuous representations of trees in hyperbolic space by running gradient descent. We compactly represent uncertainty over tree structures with vectors in the Poincaré ball. We show how the vectors can be optimized using an objective related to recently proposed cost functions for hierarchical clustering (Dasgupta, 2016; Wang and Wang, 2018). Using our method with a mini-batch stochastic gradient descent inference procedure, we outperform prior work on clustering millions of ImageNet images by 15 points of dendrogram purity. Further, our continuous tree representation can be jointly optimized in multi-task learning applications, offering a 9-point improvement over baseline methods.
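For readers unfamiliar with the Poincaré ball, the sketch below computes the standard geodesic distance between two points inside the unit ball. It is only an illustration of the geometry in which the vectors live, not the paper's objective or inference procedure; the function name `poincare_distance` and the `eps` guard are assumptions introduced here.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Uses the standard formula
        d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    The name and the eps guard are illustrative, not taken from the paper.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Guard against a vanishing denominator for points very close to the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


if __name__ == "__main__":
    # Two pairs with the same Euclidean separation: the pair near the
    # boundary is much farther apart in hyperbolic distance.
    print(poincare_distance((0.1, 0.0), (0.0, 0.1)))
    print(poincare_distance((0.9, 0.0), (0.0, 0.9)))
```

Distances grow rapidly toward the boundary of the ball, which is one reason hyperbolic space is commonly used to embed tree-like structure: points near the origin can act like nodes high in a hierarchy, and points near the boundary like leaves.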
Keywords
clustering, gradient-based clustering, hierarchical clustering