Leveraging Ensemble Pruning for Imbalanced Data Classification

2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (2018)

Abstract

The effectiveness of machine learning algorithms depends on the quality of the supplied training data. Problems embedded in the nature of the data lead to incorrect classification models, and an imbalanced data distribution is among the most significant learning difficulties that can affect classifiers. When one of the classes has many more instances than the other, the learning process becomes biased towards it. Therefore, methods for alleviating the impact of skewed distributions are highly sought after. Ensemble learning has emerged as one of the leading paradigms for imbalanced data. Creating an efficient pool of classifiers is not a trivial task, and one needs to carefully select which classifiers should be combined to obtain the best predictive power. In this paper, we propose a compound ensemble pruning algorithm for imbalanced data. It aims to retain classifiers that offer the best performance on both minority and majority classes and that display a high level of diversity. The remaining learners are discarded from the pool. This is achieved by means of a multi-criteria evolutionary algorithm. An extensive experimental study shows that our proposal is able to create smaller ensembles than state-of-the-art methods, while offering improved robustness to imbalanced class distributions.
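
The abstract names three pruning criteria (performance on the minority class, performance on the majority class, and diversity) but does not give their formulas. The sketch below is one plausible way to score a candidate sub-ensemble and compare candidates by Pareto dominance; it is not the authors' implementation. The function names, the use of sensitivity and specificity as the two per-class measures, pairwise disagreement as the diversity measure, and the tie-breaking rule in the vote are all assumptions made for illustration. The evolutionary search itself is omitted: `pareto_front` simply filters whatever candidate masks it is given, such as the population of an evolutionary algorithm.

```python
from itertools import combinations

# Assumed conventions (not from the paper): labels are 0/1, the minority
# class is 1, `preds` holds one list of predicted labels per base classifier,
# and `mask` is a 0/1 tuple that marks which classifiers are kept.

def majority_vote(preds, mask):
    """Majority vote of the kept classifiers; ties go to the minority class."""
    chosen = [p for p, keep in zip(preds, mask) if keep]
    n = len(chosen)
    return [1 if 2 * sum(column) >= n else 0 for column in zip(*chosen)]

def criteria(preds, y, mask):
    """Return (sensitivity, specificity, diversity) for a non-empty sub-ensemble."""
    chosen = [p for p, keep in zip(preds, mask) if keep]
    vote = majority_vote(preds, mask)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    sens = sum(vote[i] == 1 for i in pos) / len(pos)   # minority-class recall
    spec = sum(vote[i] == 0 for i in neg) / len(neg)   # majority-class recall
    pairs = list(combinations(chosen, 2))
    if pairs:
        # Mean fraction of instances on which two kept classifiers disagree.
        div = sum(
            sum(a != b for a, b in zip(p, q)) / len(y) for p, q in pairs
        ) / len(pairs)
    else:
        div = 0.0  # a single classifier has no pairwise diversity
    return sens, spec, div

def dominates(s, t):
    """True if score tuple s is at least as good as t everywhere and better somewhere."""
    return all(a >= b for a, b in zip(s, t)) and any(a > b for a, b in zip(s, t))

def pareto_front(preds, y, masks):
    """Keep the candidate masks whose scores no other candidate dominates."""
    scored = [(m, criteria(preds, y, m)) for m in masks if any(m)]
    return [(m, s) for m, s in scored
            if not any(dominates(t, s) for _, t in scored)]
```

In a full pruning method, the masks on the resulting front would still need a final selection rule, for example choosing the smallest sub-ensemble among them; that step is also left out here.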

Keywords

multicriteria evolutionary algorithm, imbalanced class distributions, ensemble pruning, imbalanced data classification, machine learning algorithms, supplied training data, imbalanced data distribution, classifiers, skewed distributions, ensemble learning
