An Evaluation Measure for Learning from Imbalanced Data Based on Asymmetric Beta Distribution
Classification and Data Mining (2013)
Abstract
Hand (Mach Learn 77:103–123, 2009) has shown that the AUC has a serious deficiency: it implicitly uses different misclassification cost distributions for different classifiers. Using the AUC is therefore comparable to evaluating different classifiers with different metrics. To overcome this incoherence, the H measure was proposed, which replaces the implicit cost weight distributions in the AUC with a symmetric Beta distribution. When learning from imbalanced data, however, misclassifying a minority class example is much more serious than misclassifying a majority class example. To take these different misclassification costs into account, we propose using an asymmetric Beta distribution (B42) instead of a symmetric one. Experimental results on 36 imbalanced datasets using SVMs and logistic regression show that the asymmetric B42 could be a good choice for evaluating classifiers in imbalanced data environments, since it puts more weight on the minority class.
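The difference between a symmetric and an asymmetric cost weight distribution can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that B42 denotes Beta(4, 2), that the symmetric case is Beta(2, 2), and that the cost variable c is normalized to the interval [0, 1]. The helper names are hypothetical, and which end of the cost range corresponds to the minority class depends on how the paper defines c.

```python
import math

def beta_pdf(c, a, b):
    """Density of Beta(a, b) at a normalized cost c in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * c ** (a - 1) * (1 - c) ** (b - 1)

def beta_mean(a, b):
    """Mean of Beta(a, b): the average weight placed on the cost c."""
    return a / (a + b)

def mass_above(threshold, a, b, steps=10000):
    """Midpoint-rule estimate of the probability that c exceeds threshold."""
    width = (1.0 - threshold) / steps
    total = sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width

SYMMETRIC = (2, 2)   # assumed symmetric weighting
ASYMMETRIC = (4, 2)  # assumption: "B42" read as Beta(4, 2)

if __name__ == "__main__":
    for name, (a, b) in (("symmetric", SYMMETRIC), ("asymmetric", ASYMMETRIC)):
        print(name, "mean:", beta_mean(a, b), "mass above 0.5:", mass_above(0.5, a, b))
```

Under these assumptions the symmetric distribution splits its mass evenly around c = 0.5, while the asymmetric one concentrates most of its mass on one side, which is the mechanism by which one type of misclassification receives more weight.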
Keywords
Imbalanced Data, Symmetric Beta Distribution, Minority Class Examples, Credit Card Transaction Fraud, False Negative Cost