Active Learning Strategies for Semi-Supervised DBSCAN.

ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2014 (2014)

Abstract
The semi-supervised, density-based clustering algorithm SSDBSCAN uses a small set of labeled objects to extract clusters at different density levels from a given dataset. A critical assumption of SSDBSCAN, however, is that at least one labeled object is provided for each natural cluster in the dataset. This assumption may be unrealistic when only very few labeled objects can be provided, for instance because of the cost of determining an object's class label. In this paper, we introduce a novel active learning strategy that selects the "most representative" objects, whose class labels are then determined and used as input for SSDBSCAN. By incorporating a Laplacian Graph Regularizer into a Local Linear Reconstruction method, our proposed algorithm selects objects that represent the whole data space well. Experiments on synthetic and real datasets show that, with the proposed active learning strategy, SSDBSCAN is able to extract more meaningful clusters even when only very few labeled objects are provided.
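The abstract does not spell out the selection objective, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes LLE-style local reconstruction weights, an unweighted k-nearest-neighbor graph Laplacian, and a greedy search that repeatedly adds the object whose labeling minimizes the reconstruction error of the whole dataset. All function names and the parameters `k`, `mu`, `gamma`, and `ridge` are hypothetical choices made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    # Pairwise squared distances; the diagonal is inf so no point is its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def reconstruction_weights(X, nbrs, reg=1e-3):
    # LLE-style weights: each point is approximated by an affine
    # combination of its k nearest neighbors (each row of W sums to 1).
    n, k = nbrs.shape
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                              # neighbors centered on x_i
        G = Z @ Z.T                                        # local Gram matrix (k x k)
        G += (reg * np.trace(G) + 1e-9) * np.eye(k)        # keep G well conditioned
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()
    return W

def knn_laplacian(nbrs, n):
    # Unnormalized Laplacian of the symmetrized, unweighted kNN graph.
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), nbrs.shape[1])
    A[rows, nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)
    return np.diag(A.sum(axis=1)) - A

def select_representatives(X, budget, k=5, mu=1.0, gamma=0.1, ridge=1e-6):
    # Assumed objective: reconstruct all points Q from the selected ones while
    # penalizing local-linear inconsistency (M) and graph roughness (L).
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    W = reconstruction_weights(X, nbrs)
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    R = mu * M + gamma * knn_laplacian(nbrs, n) + ridge * I
    selected = []
    for _ in range(budget):
        best, best_err = None, np.inf
        for c in range(n):
            if c in selected:
                continue
            lam = np.zeros(n)
            lam[selected + [c]] = 1.0                      # indicator of queried objects
            Q = np.linalg.solve(np.diag(lam) + R, lam[:, None] * X)
            err = ((X - Q) ** 2).sum()                     # global reconstruction error
            if err < best_err:
                best, best_err = c, err
        selected.append(best)
    return selected

# Toy data: three well-separated Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(20, 2)) for c in ((0, 0), (5, 0), (0, 5))])
picked = select_representatives(X, budget=3)
print(picked)  # indices of the objects whose labels would be requested
```

In a pipeline of the kind the abstract describes, the returned indices would be sent to an oracle for labeling and the resulting labeled objects passed to SSDBSCAN. The greedy loop solves one linear system per candidate, so this sketch is only practical for small datasets.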
Keywords
Active learning, Semi-supervised clustering, Density-based clustering