Feature Selection and Reduction Based on SMOTE and Information Gain for Sentiment Mining

Patcharanikarn Pongthanoo, Wararat Songpan

2020 5th International Conference on Computer and Communication Systems (ICCCS)

Abstract
Sentiment classification often suffers because the data contain a large number of features and the classes are imbalanced, which lowers accuracy, particularly for the minority (negative) class. The purpose of this study was to determine a suitable number of features for each class by applying information gain (IG) for feature reduction, combined with the synthetic minority over-sampling technique (SMOTE) to adjust the class imbalance. The approach improved classification accuracy for every class and was evaluated with four classifiers: J48, Naïve Bayes, k-Nearest Neighbor (k = 1, 2, 3), and Support Vector Machine (SVM). The TP Rate of each class, positive and negative, was used as the evaluation metric. The results show that IG combined with SMOTE suggests a suitable number of features for sentiment analysis, and that SVM achieves the highest accuracy, with a TP Rate of 86.50 % for the positive class and 89.10 % for the negative class at a SMOTE level of 300 %.
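The following is a minimal sketch of the IG + SMOTE pipeline described in the abstract, using scikit-learn and imbalanced-learn as stand-ins for the authors' tooling (the classifiers listed, such as J48, suggest Weka was used). The synthetic dataset, the number of selected features k, and the sampling_strategy value are illustrative assumptions, not the paper's actual configuration; information gain is approximated here by mutual information, and imbalanced-learn expresses the oversampling ratio differently from Weka's percentage-based SMOTE (e.g. 300 %).

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

# Synthetic imbalanced data standing in for the sentiment feature matrix;
# class 0 (the minority) plays the role of the negative sentiment class.
X, y = make_classification(n_samples=2000, n_features=200, n_informative=30,
                           weights=[0.8, 0.2], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=42)

pipeline = Pipeline([
    # Feature reduction: keep the k features with the highest information
    # gain (approximated by mutual information); k = 50 is an assumption.
    ("ig", SelectKBest(mutual_info_classif, k=50)),
    # SMOTE oversamples the minority class on the training folds only;
    # sampling_strategy=1.0 balances the two classes.
    ("smote", SMOTE(sampling_strategy=1.0, random_state=42)),
    ("svm", SVC(kernel="linear")),
])
pipeline.fit(X_train, y_train)

# Per-class TP Rate (recall), the metric used in the paper's evaluation.
y_pred = pipeline.predict(X_test)
tp_rate_pos = recall_score(y_test, y_pred, pos_label=1)
tp_rate_neg = recall_score(y_test, y_pred, pos_label=0)
print(f"TP Rate (positive): {tp_rate_pos:.3f}")
print(f"TP Rate (negative): {tp_rate_neg:.3f}")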
Keywords
feature selection, feature reduction, SMOTE, information gain, imbalanced data