A Semi-Supervised Auc Optimization Method With Generative Models

2016 IEEE 16th International Conference on Data Mining (ICDM)(2016)

引用 8|浏览3
暂无评分
摘要
This paper presents a semi-supervised learning method for improving the performance of AUC-optimized classifiers by using both labeled and unlabeled samples. In actual binary classification tasks, there is often an imbalance between the numbers of positive and negative samples. For such imbalanced tasks, the area under the ROC curve (AUC) is an effective measure with which to evaluate binary classifiers. The proposed method utilizes generative models to assist the incorporation of unlabeled samples in AUC-optimized classifiers. The generative models provide prior knowledge that helps learn the distribution of unlabeled samples. To evaluate the proposed method in text classification, we employed naive Bayes models as the generative models. Our experimental results using three test collections confirmed that the proposed method provided better classifiers for imbalanced tasks than supervised AUC-optimized classifiers and semi-supervised classifiers trained to maximize the classification accuracy of labeled samples. Moreover, the proposed method improved the effect of using unlabeled samples for AUC optimization especially when we used appropriate generative models.
更多
查看译文
关键词
semi-supervised AUC optimization method,semi-supervised learning method,binary classification tasks,text classification,naive Bayes models
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要