Active learning through label error statistical methods

Knowledge-Based Systems (2020)

Citations: 17 | Views: 43
Abstract
Clustering-based active learning splits data into a number of blocks and queries the labels of the most critical instances. An active learner must decide how to choose these critical instances and how to split the blocks. In this paper, we present theoretical and practical statistical methods for analyzing the relationship between the label error and the neighbor radius, and design new split and selection strategies to handle these two issues. First, we define statistical functions for the label error based on a single instance and instance pairs. Second, we build practical statistical models, calculate empirical label errors, and guide the block splitting process. Third, using these practical models, we develop a center-and-edge instance selection strategy for choosing critical instances. Fourth, we design a new algorithm called active learning through label error statistical methods (ALSE). Learning experiments were performed with 20 datasets from various domains. The results of significance tests verify the effectiveness of ALSE and its superiority over state-of-the-art active learning algorithms.
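To make the relationship between label error and neighbor radius concrete, the sketch below estimates an empirical pair-based label error: the fraction of instance pairs lying within a given radius whose labels differ. This is only an illustration under simple assumptions (Euclidean distance, fully labeled data, a hypothetical function name `pairwise_label_error`); it is not the statistical model or the ALSE algorithm from the paper.

```python
import numpy as np

def pairwise_label_error(X, y, radius):
    """Fraction of instance pairs within `radius` whose labels differ.

    Hypothetical helper for illustration; assumes Euclidean distance.
    Returns 0.0 when no pair falls within the radius.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Euclidean distance between every pair of instances.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Consider each unordered pair (i < j) exactly once.
    i, j = np.triu_indices(len(X), k=1)
    close = dist[i, j] <= radius
    if not close.any():
        return 0.0
    return float((y[i][close] != y[j][close]).mean())

# Two well-separated groups on a line, one label per group.
X = [[0.0], [1.0], [10.0], [11.0]]
y = [0, 0, 1, 1]
small = pairwise_label_error(X, y, radius=1.5)   # only same-group pairs
large = pairwise_label_error(X, y, radius=20.0)  # all six pairs
```

With a small radius only same-group pairs are counted, so the estimated error stays low; as the radius grows, cross-group pairs enter and the error rises. A curve of this kind is the sort of signal that could suggest when a block is too heterogeneous and should be split.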
Keywords
Active learning, Clustering, Label error statistical model, Probabilistic Lipschitzness