Fast and Effective Active Clustering Ensemble Based on Density Peak

IEEE Transactions on Neural Networks and Learning Systems(2021)

引用 23|浏览93
暂无评分
摘要
Semisupervised clustering methods improve performance by randomly selecting pairwise constraints, which may lead to redundancy and instability. In this context, active clustering is proposed to maximize the efficacy of annotations by effectively using pairwise constraints. However, existing methods lack an overall consideration of the querying criteria and repeatedly run semisupervised clustering to update labels. In this work, we first propose an active density peak (ADP) clustering algorithm that considers both representativeness and informativeness. Representative instances are selected to capture data patterns, while informative instances are queried to reduce the uncertainty of clustering results. Meanwhile, we design a fast-update-strategy to update labels efficiently. In addition, we propose an active clustering ensemble framework that combines local and global uncertainties to query the most ambiguous instances for better separation between the clusters. A weighted voting consensus method is introduced for better integration of clustering results. We conducted experiments by comparing our methods with state-of-the-art methods on real-world data sets. Experimental results demonstrate the effectiveness of our methods.
更多
查看译文
关键词
Active clustering,clustering ensemble,density peak (DP) clustering
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要