Co-op Training - A Semi-supervised Learning Method for Data Streams

SMC(2021)

Abstract
Applying machine learning algorithms to data streams is a challenging task because traditional strategies assume datasets are labeled, finite, and stationary. In the context of data streams, where data is generated in real time and labels may be missing due to the high cost of the labeling process, semi-supervised learning (SSL) strategies that learn from labeled and unlabeled data simultaneously are a viable, if still challenging, solution. In this paper, we present a novel approach to handling missing labels for classification learning in data streams, named co-op training, which is based on incremental self-training and co-training. In a controlled experiment, we execute the proposed algorithm, along with the most well-known semi-supervised learning strategies, on 11 artificial and real-world datasets and compare the results. We found our strategy to be more accurate than the other SSL algorithms on most datasets, while also achieving better run-times when accuracies were similar. These methods are implemented in the Massive Online Analysis (MOA) open-source software as an internal benchmark component, to help researchers easily run experimental comparisons of semi-supervised learning on data streams.
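The abstract describes combining incremental self-training with co-training: a model is updated directly from labeled instances and assigns pseudo-labels to unlabeled ones only when sufficiently confident. The sketch below illustrates the incremental self-training half of that idea with an online nearest-centroid classifier; it is a hypothetical, minimal illustration, not the paper's co-op training algorithm or the MOA implementation, and the confidence threshold is an assumed design choice.

```python
import math

class StreamSelfTrainer:
    """Minimal incremental self-training sketch (illustrative only,
    not the paper's co-op training): an online nearest-centroid
    classifier that learns from labeled instances directly and
    pseudo-labels unlabeled instances only when confident."""

    def __init__(self, threshold=2.0):
        # label -> (running sum vector, count); centroid = sum / count
        self.centroids = {}
        # confidence rule (assumed): second-nearest centroid must be at
        # least `threshold` times farther than the nearest one
        self.threshold = threshold

    def _dist(self, x, centroid):
        s, n = centroid
        return math.sqrt(sum((xi - si / n) ** 2 for xi, si in zip(x, s)))

    def predict(self, x):
        if not self.centroids:
            return None
        return min(self.centroids,
                   key=lambda lab: self._dist(x, self.centroids[lab]))

    def _update(self, x, y):
        if y not in self.centroids:
            self.centroids[y] = ([0.0] * len(x), 0)
        s, n = self.centroids[y]
        self.centroids[y] = ([si + xi for si, xi in zip(s, x)], n + 1)

    def partial_fit(self, x, y=None):
        if y is not None:               # labeled instance: learn directly
            self._update(x, y)
            return
        if len(self.centroids) < 2:     # need two classes to gauge margin
            return
        d = sorted((self._dist(x, c), lab)
                   for lab, c in self.centroids.items())
        # pseudo-label only when the nearest centroid is clearly closer
        if d[1][0] >= self.threshold * max(d[0][0], 1e-12):
            self._update(x, d[0][1])
```

A co-training extension would run two such learners on different feature views, each feeding its confident pseudo-labels to the other; confidently pseudo-labeled points here simply fold back into the same model's centroids.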
Keywords
semi-supervised learning method, data streams, classification learning, machine learning algorithms, incremental self-training, SSL algorithms, massive online analysis, MOA, open-source software, internal benchmark component