Threshold prediction for detecting rare positive samples using a meta-learner

Pattern Analysis and Applications (2022)

Abstract
Threshold-moving is one of several techniques for correcting the bias of binary classifiers towards the majority class. In this approach, the decision threshold is adjusted to detect the minority class at the cost of increased misclassification of the majority class. In practice, selecting a good threshold by cross-validation on the training data is not feasible for some problems because only a few minority samples are available. This study addresses threshold estimation in the case of rare positive samples by building a meta-learner for threshold prediction. Novel meta-features are proposed to quantify the imbalance characteristics of the data sets and the patterns among the prediction scores. A random forest-based threshold prediction model is constructed using these meta-features, which are extracted from the score space of external data. The resulting models are then employed to estimate the optimal thresholds for previously unseen data sets. The random forest-based meta-learner, which employs an implicitly selected subset of the proposed meta-features and encodes information from multiple external sources in the form of different trees, is evaluated on 52 imbalanced data sets. In the first set of experiments, the best-fitting thresholds are computed for SVM and logistic regression classifiers trained on the original imbalanced training sets. The experiments are then repeated using ensembles of multiple learners, each trained on a different balanced data set. The proposed approach is observed to provide better F-scores than alternative threshold-moving and balancing techniques.
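As a rough illustration of plain threshold-moving (not the meta-learner proposed in the paper), the sketch below scans candidate thresholds over a classifier's prediction scores and keeps the one with the highest F-score. The function names `f_score` and `best_threshold` and the toy labels and scores are hypothetical, chosen only for this example. The abstract does not specify the meta-features or the random forest configuration, so they are not reproduced here.

```python
def f_score(labels, scores, threshold):
    """F1 for the positive class, predicting positive when score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(labels, scores):
    """Return (threshold, F1) maximising F1, using the observed scores as candidates."""
    candidates = sorted(set(scores))
    best_t, best_f = candidates[0], -1.0
    for t in candidates:
        f = f_score(labels, scores, t)
        if f > best_f:  # strict comparison keeps the lowest threshold on ties
            best_t, best_f = t, f
    return best_t, best_f


# Toy imbalanced data (assumed for illustration): 8 negatives, 2 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.45, 0.40, 0.48]

default_f = f_score(labels, scores, 0.5)   # conventional 0.5 threshold
t, f = best_threshold(labels, scores)      # moved threshold
print(f"F1 at 0.5: {default_f:.2f}; best threshold: {t:.2f} with F1 {f:.2f}")
```

On this toy data the default threshold of 0.5 detects no positives, while lowering the threshold recovers both at the cost of one false positive. With only two positive samples the selected threshold is very sensitive to individual scores, which is the estimation difficulty the abstract describes for rare positive samples.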
Keywords
Imbalance learning, Thresholding, Classifier ensembles, Balancing, Binary classification