Feature Selection Techniques to Counter Class Imbalance Problem for Aging Related Bug Prediction Aging Related Bug Prediction

ISEC'18: PROCEEDINGS OF THE 11TH INNOVATIONS IN SOFTWARE ENGINEERING CONFERENCE(2018)

引用 14|浏览0
暂无评分
摘要
Aging-Related Bugs (ARBs) occur in long running systems due to error conditions caused because of accumulation of problems such as memory leakage or unreleased files and locks. Aging-Related Bugs are hard to discover during software testing and also challenging to replicate. Automatic identification and prediction of aging related fault-prone files and classes in an object oriented system can help the software quality assurance team to optimize their testing efforts. In this paper, we present a study on the application of static source code metrics and machine learning techniques to predict aging related bugs. We conduct a series of experiments on publicly available dataset from two large open-source software systems: Linux and MySQL. Class imbalance and high dimensionality are the two main technical challenges in building effective predictors for aging related bugs. We investigate the application of five different feature selection techniques (OneR, Information Gain, Gain Ratio, RELEIF and Symmetric Uncertainty) for dimensionality reduction and five different strategies (Random Under-sampling, Random Oversampling, SMOTE, SMOTEBoost and RUSBoost) to counter the effect of class imbalance in our proposed machine learning based solution approach. Experimental results reveal that the random under-sampling approach performs best followed by RUSBoost in-terms of the mean AUC metric. Statistical significance test demonstrates that there is a significant difference between the performance of the various feature selection techniques. Experimental results shows that Gain Ratio and RELEIF performs best in comparison to other strategies to address the class imbalance problem. We infer from the statistical significance test that there is no difference between the performances of the five different learning algorithms.
更多
查看译文
关键词
Aging Related Bugs,Imbalance Learning,Empirical Software Engineering,Feature Selection Techniques,Machine Learning,Predictive Modeling,Software Maintenance,Source Code Metrics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要