Self-paced ensemble and big data identification: a classification of substantial imbalance computational analysis

The Journal of Supercomputing(2023)

引用 0|浏览8
暂无评分
摘要
This research paper focuses on the challenges associated with learning classifiers from large-scale, highly imbalanced datasets prevalent in many real-world applications. Traditional algorithms learning often need better performance and high computational efficiency when dealing with imbalanced data. Factors such as class imbalance, noise, and class overlap make it demanding to learn effective classifiers. In this study, we propose a novel self-paced ensemble framework for classifying imbalanced data. The framework employs under-sampling to self-harmonize data hardness and build a robust ensemble. Extensive experimental testing demonstrates promising results in handling overlapping classes and skewed distributions while maintaining computational efficiency. The self-paced ensemble method addresses the challenges of high imbalance ratios, class overlap, and noise presence in large-scale imbalanced classification problems. By incorporating the knowledge of these challenges into our learning framework, we establish the concept of classification hardness distribution, and the self-paced ensemble is a revolutionary learning paradigm for massive imbalance categorization, capable of improving the performance of existing learning algorithms on imbalanced data and providing better results for future applications.
更多
查看译文
关键词
Self-paced ensemble,Big data,Classification,Computational,Simulation,Substantial imbalance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要