Combining Active Learning and Self-Labeling for Data Stream Mining.

PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON COMPUTER RECOGNITION SYSTEMS CORES 2017(2018)

引用 8|浏览8
暂无评分
摘要
Data stream mining is among the most vital contemporary data science challenges. In this work we concentrate on the issue of actual availability of true class labels. Assumption that the ground truth for each instance becomes known right after processing it is far from being realistic, due to usually high costs connected with its acquisition. Active learning is an attractive solution to this problem, as it selects most valuable instances for labeling. In this paper, we propose to augment the active learning module with self-labeling approach. This allows classifier to automatically label instances for which it displays the highest certainty and use them for further training. Although in this preliminary work we use a static threshold for self-labeling, the obtained results are encouraging. Our experimental study shows that this approach complements the active learning strategy and allows to improve data stream classification, especially in scenarios with very small labeling budget.
更多
查看译文
关键词
Machine learning,Data stream mining,Active learning,Semi-supervised learning,Self-labeling
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要