Extended natural neighborhood for SMOTE and its variants in imbalanced classification

Engineering Applications of Artificial Intelligence (2023)

Abstract
Imbalanced data classification is a challenging problem in many practical applications. The synthetic minority oversampling technique (SMOTE) and its variants are popular resampling methods. However, in most of these methods, the neighborhood determined by k-nearest neighbors (kNN) does not reflect the local distribution precisely, which leads to the generation of noisy examples. To solve this problem, we propose a neighborhood concept that requires no parameter k, called the extended natural neighbor (ENaN), which is derived from the natural neighbor (NaN). ENaN combines kNN and reverse kNN to determine neighbors adaptively according to the sample distribution. Compared with NaN, ENaN explores broader neighborhoods, which helps improve the quality of the generated examples. ENaN-based SMOTE (ENaNSMOTE) can improve on the sample distribution obtained by SMOTE and NaNSMOTE. Extensive experiments on 30 synthetic and 20 real-world datasets demonstrate the effectiveness of ENaN in SMOTE and its variants.
Keywords
Imbalanced classification, SMOTE, Extended natural neighbor, k-nearest neighbor, Reverse k-nearest neighbor
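
The abstract's idea of uniting kNN and reverse kNN can be illustrated with a small sketch. This is not the ENaN algorithm from the paper: the abstract does not give the adaptive, parameter-free search, so the sketch assumes a fixed k purely for illustration. The function names (`knn_indices`, `knn_union_reverse_knn`, `interpolate_minority`) are hypothetical, and the interpolation step is the standard SMOTE-style linear interpolation between a minority sample and one of its neighbors.

```python
import numpy as np


def knn_indices(X, k):
    """Return an (n, k) array holding the indices of each row's k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]


def knn_union_reverse_knn(X, k):
    """Neighborhood of i = kNN(i) united with {j : i is in kNN(j)}.

    Assumption: k is fixed here; the paper's ENaN is described as parameter-free.
    """
    knn = knn_indices(X, k)
    neighbors = [set(row.tolist()) for row in knn]
    for j in range(len(knn)):
        for i in knn[j]:
            # j lists i among its k nearest neighbors, so j is a reverse neighbor of i
            neighbors[int(i)].add(j)
    return neighbors


def interpolate_minority(X_min, neighbors, n_new, rng):
    """SMOTE-style synthesis: pick a sample, pick one of its neighbors, interpolate."""
    X_min = np.asarray(X_min, dtype=float)
    out = np.empty((n_new, X_min.shape[1]))
    for t in range(n_new):
        i = int(rng.integers(len(X_min)))
        j = int(rng.choice(sorted(neighbors[i])))
        gap = rng.random()  # uniform in [0, 1)
        out[t] = X_min[i] + gap * (X_min[j] - X_min[i])
    return out


# Three minority points on a line; the middle one is "popular" and gains a reverse neighbor.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
nbrs = knn_union_reverse_knn(X_min, k=1)
synthetic = interpolate_minority(X_min, nbrs, n_new=5, rng=np.random.default_rng(0))
```

With k = 1, the middle point ends up with two neighbors while the others keep one, which shows how adding reverse neighbors lets neighborhood size vary with the local distribution even under a fixed k.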