The Subtle Art of Digging for Defects: Analyzing Features for Defect Prediction in Java Projects

Geanderson Santos, Adriano Veloso, Eduardo Figueiredo

ENASE: Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering (2022)


Abstract

Predicting software defects remains an active topic of investigation in the software engineering and machine learning communities. The literature has proposed numerous machine learning models and software features to anticipate defects in source code. Furthermore, as new machine learning approaches emerge in the research community, the possibilities for predicting defects expand. In this paper, we discuss the results of predicting software defects with a dataset used in previous work. The dataset contains 47,618 classes from 53 Java software projects and covers 66 software features related to numerous aspects of the code. In our investigation, we compare eight machine learning models: Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbor (KNN), Multilayer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (CART), Random Forest (RF), and Gradient Boosting Machine (GBM). To contrast the models' performance, we use five evaluation metrics frequently applied in the defect prediction literature. We hope this approach can guide further discussion about benchmark machine learning models for defect prediction.


Keywords

Defect Prediction, Software Features for Defect Prediction, Machine Learning Models
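
The abstract does not name its five evaluation metrics, so the following is only a minimal sketch of how per-class predictions from competing models might be contrasted. It assumes binary labels (1 = defective class, 0 = clean class) and picks accuracy, precision, recall, F1, and the Matthews correlation coefficient (MCC) as illustrative metrics. The function names, the model labels, and the toy label vectors are hypothetical and are not taken from the paper or its dataset.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    """Compute five illustrative metrics (an assumption, not the paper's list)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(
            tp * tn - fp * fn,
            sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        ),
    }


# Toy ground truth and predictions for two hypothetical models.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 0, 1, 0, 0, 0, 0],
    "model_b": [1, 0, 0, 0, 0, 0, 0, 0],
}

results = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Defect datasets are typically imbalanced, with far fewer defective classes than clean ones, so accuracy alone can look favorable for a model that rarely flags defects. Reporting several metrics side by side, as the abstract describes, makes that trade-off visible.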
