Generative learning for imbalanced data using the Gaussian mixed model

Applied Soft Computing (2019)

Cited by 17 | Views 73
Abstract
Imbalanced data classification, an important type of classification task, is challenging for standard learning algorithms. Among the various strategies for handling the problem, data-level imbalanced learning methods have attracted considerable attention from researchers in recent years. However, most data-level approaches linearly generate new instances from local neighbor information rather than from the overall data distribution. Differing from these algorithms, in this study we develop a new data-level method, namely generative learning (GL), to deal with imbalanced problems. In GL, we fit the distribution of the original data and generate new data on the basis of that distribution by adopting the Gaussian mixed model. The generated data, including synthetic minority and majority classes, are used to train learning models. The proposed method is validated through experiments on real-world data sets. Results show that our approach is competitive with other methods, such as SMOTE, SMOTE-ENN, SMOTE-TomekLinks, Borderline-SMOTE, and safe-level-SMOTE. The Wilcoxon signed-rank test is applied, and the test results confirm the significant superiority of our proposal.
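The abstract's core idea, fitting a Gaussian mixture to each class and sampling a balanced synthetic training set from the fitted distributions, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name `gl_resample` and the `n_components` setting are assumptions for the sketch.

```python
# Hedged sketch of the generative-learning (GL) idea described in the
# abstract: fit a Gaussian mixture model per class, then sample synthetic
# points from each fitted mixture so that all classes end up the same size.
# `gl_resample` and its parameters are illustrative, not from the paper.
import numpy as np
from sklearn.mixture import GaussianMixture

def gl_resample(X, y, n_components=2, random_state=0):
    """Fit a GMM to each class and sample a balanced synthetic data set."""
    classes, counts = np.unique(y, return_counts=True)
    n_target = counts.max()  # sample every class up to the majority size
    Xs, ys = [], []
    for c in classes:
        Xc = X[y == c]
        gmm = GaussianMixture(
            n_components=min(n_components, len(Xc)),
            random_state=random_state,
        ).fit(Xc)
        Xg, _ = gmm.sample(n_target)  # draw synthetic points for class c
        Xs.append(Xg)
        ys.append(np.full(n_target, c))
    return np.vstack(Xs), np.concatenate(ys)

# Imbalanced toy data: 100 majority points vs 10 minority points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 100 + [1] * 10)
Xb, yb = gl_resample(X, y)
print(Xb.shape, np.bincount(yb))  # balanced: 100 samples per class
```

The balanced set `(Xb, yb)` would then be used to train the downstream classifier, in place of the original imbalanced data.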
Keywords
Imbalanced learning, Gaussian mixed model, Sample generation