Uncertainty Based Under-Sampling for Learning Naive Bayes Classifiers Under Imbalanced Data Sets.

IEEE Access (2020)

Cited by 32 | Views 18
Abstract
In many real-world classification tasks, the data classes are not represented equally. This problem, also known as the curse of class imbalance, can bias the training procedure of a classifier in favor of the majority class. In the work at hand, an under-sampling approach is proposed that leverages a Naive Bayes classifier to select the most informative instances from the available training set, starting from a random initial selection. The method begins by learning a Naive Bayes classification model on a small, stratified initial training set. It then iteratively augments the training set with the instances the model is most uncertain about and retrains the model until some stopping criteria are satisfied. The overall performance of the proposed method has been scrutinized through a rigorous experimental procedure on six multimodal data sets as well as forty-four standard benchmark data sets. The empirical results indicate that the proposed under-sampling method achieves classification performance comparable to other resampling techniques, as measured by several appropriate metrics and validated by a suitable statistical testing procedure.
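The iterative loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' exact procedure: the function name, batch size, least-confidence uncertainty measure, and fixed-iteration stopping criterion are all assumptions made for the example.

```python
# Hypothetical sketch of uncertainty-based under-sampling with a Naive Bayes
# base learner. Parameter names and the stopping rule are illustrative only.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def uncertainty_undersample(X, y, init_per_class=5, batch_size=10,
                            max_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    labeled = []
    # Stratified initial selection: a few random instances per class.
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        labeled.extend(rng.choice(idx, size=min(init_per_class, len(idx)),
                                  replace=False))
    labeled = set(int(i) for i in labeled)
    model = GaussianNB()
    for _ in range(max_iters):
        train = sorted(labeled)
        model.fit(X[train], y[train])
        pool = np.array([i for i in range(len(y)) if i not in labeled])
        if len(pool) == 0:
            break
        proba = model.predict_proba(X[pool])
        # Least-confidence uncertainty: 1 minus the top class probability.
        uncertainty = 1.0 - proba.max(axis=1)
        # Teach the base model the instances it is most uncertain about.
        take = pool[np.argsort(uncertainty)[::-1][:batch_size]]
        labeled.update(int(i) for i in take)
    return model, sorted(labeled)
```

The returned index list is the under-sampled training set: it grows only by the most uncertain instances per round, so the final model is trained on far fewer majority-class examples than the full data set contains.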
Keywords
Active selection, classification, naive Bayes, imbalanced data, under-sampling