Systematic Review of Machine Learning–Based Open-Source Software Maintenance Effort Estimation

Recent Advances in Computer Science and Communications (2022)

Abstract
Background: Software maintenance is a laborious activity in the software lifecycle and is often considered more expensive than other activities. Open-source software (OSS) has recently gained considerable acceptance in industry, and maintenance effort estimation (MEE) for such software has emerged as an important research topic. In this context, researchers have conducted many open-source software maintenance effort estimation (O-MEE) studies using statistical as well as machine learning (ML) techniques to improve estimation.

Objective: This study performs a systematic literature review (SLR) to analyze and summarize the empirical evidence on ML techniques for O-MEE in current research through a set of five research questions (RQs) related to several criteria (e.g., data pre-processing tasks, data mining tasks, parameter tuning methods, accuracy criteria and statistical tests, and the ML techniques reported in the literature as outperforming others).

Method: We performed a systematic literature review of 36 primary empirical studies published from 2000 to June 2020, selected through an automated search of six digital databases.

Results: The findings show that Bayesian networks, decision trees, support vector machines, and instance-based reasoning were the most frequently used ML techniques; few studies opted for ensemble or hybrid techniques. Researchers have paid less attention to O-MEE data pre-processing in terms of feature selection and methods for handling missing values and imbalanced datasets, and to tuning the parameters of ML techniques. Classification is the data mining task most often addressed, evaluated with accuracy criteria such as Precision, Recall, and Accuracy, together with the Wilcoxon and Mann-Whitney statistical tests.

Conclusion: This SLR identifies a number of gaps in current research and suggests areas for further investigation. For instance, since OSS includes different data source formats, researchers should pay more attention to data pre-processing and should develop new models using ensemble techniques, since these have been reported to perform better.
Keywords
maintenance, software, systematic review, learning-based, open-source