A replica analysis of under-bagging

arXiv (2024)

Abstract
Sharp asymptotics of the under-bagging (UB) method, a popular ensemble learning method for training classifiers from imbalanced data, are derived and used to compare UB with several other standard methods for learning from imbalanced data, in the scenario where a linear classifier is trained on binary mixture data. The methods compared include the under-sampling (US) method, which trains a model using a single realization of the subsampled dataset, and the simple weighting (SW) method, which trains a model with a weighted loss on the entire dataset. It is shown that the performance of UB improves as the size of the majority class increases, even though the class imbalance can become large, especially when the minority class is small. This is in contrast to US, whose performance does not change as the size of the majority class increases, and to SW, whose performance degrades as the imbalance increases. These results differ from the case of naive bagging in training generalized linear models without considering the structure of class imbalance, indicating an intrinsic difference between ensembling and direct regularization of the parameters.
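To make the three methods concrete, the following is a minimal sketch of US, UB, and SW for a linear classifier on synthetic two-component Gaussian mixture data. It is an illustration under stated assumptions, not the setup analyzed in the paper: the data generator, the ridge-regularized squared loss, the inverse-class-size weights, and all function names (`make_mixture`, `fit_linear`, `under_sample`, `fit_us`, `fit_ub`, `fit_sw`, `predict`) are hypothetical choices made here for brevity.

```python
import numpy as np


def make_mixture(n_minority, n_majority, dim, rng, signal=1.0):
    """Assumed toy data: x = y * mu + Gaussian noise, with labels y in {+1, -1}."""
    mu = np.full(dim, signal / np.sqrt(dim))  # hypothetical cluster centre
    y = np.concatenate([np.ones(n_minority), -np.ones(n_majority)])
    X = y[:, None] * mu + rng.standard_normal((y.size, dim))
    return X, y


def fit_linear(X, y, sample_weight=None, lam=1e-2):
    """Weighted ridge least squares with a bias term (a stand-in loss)."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    w = np.ones(X.shape[0]) if sample_weight is None else sample_weight
    A = Xa.T @ (w[:, None] * Xa) + lam * np.eye(Xa.shape[1])
    b = Xa.T @ (w * y)
    return np.linalg.solve(A, b)  # last entry is the bias


def under_sample(X, y, rng):
    """Draw a balanced subset: equal numbers of points from each class."""
    pos = np.flatnonzero(y > 0)
    neg = np.flatnonzero(y < 0)
    n = min(pos.size, neg.size)
    keep = np.concatenate([
        rng.choice(pos, n, replace=False),
        rng.choice(neg, n, replace=False),
    ])
    return X[keep], y[keep]


def fit_us(X, y, rng):
    """US: train on a single realization of the under-sampled data."""
    return fit_linear(*under_sample(X, y, rng))


def fit_ub(X, y, rng, n_models=20):
    """UB: average the linear models fitted on independent under-samples."""
    return np.mean([fit_us(X, y, rng) for _ in range(n_models)], axis=0)


def fit_sw(X, y):
    """SW: train on all data with weights inversely proportional to class size."""
    n_pos, n_neg = np.sum(y > 0), np.sum(y < 0)
    weights = np.where(y > 0, 1.0 / n_pos, 1.0 / n_neg)
    return fit_linear(X, y, sample_weight=weights)


def predict(theta, X):
    """Sign of the linear score; theta[:-1] are weights, theta[-1] is the bias."""
    return np.sign(X @ theta[:-1] + theta[-1])
```

For a linear model, averaging the fitted parameters is the same as averaging the linear scores, which is why `fit_ub` returns a single parameter vector. Calling, for example, `fit_ub(X, y, np.random.default_rng(0))` on data from `make_mixture` and comparing it with `fit_us` and `fit_sw` on held-out points gives a small-scale way to explore the qualitative comparison described above.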