Software effort estimation based on open source projects: Case study of Github.

Information and Software Technology(2017)

引用 37|浏览66
暂无评分
摘要
Abstract Context Managers usually want to pre-estimate the effort of a new project for reasonably dividing their limited resources. In reality, it is common practice to train a prediction model based on effort datasets to predict the effort required by a project. Sufficient data is the basis for training a good estimator, yet most of the data owners are unwilling to share their closed source project (CSP) effort data due to the privacy concerns, which means that we can only obtain a small number of effort data. Effort estimator built on the limited data usually cannot satisfy the practical requirement. Objective We aim to provide a method which can be used to collect sufficient data for solving the problem of lack of training data when building an effort estimation model. Method We propose to mine GitHub to collect sufficient and diverse real-life effort data for effort estimation. Specifically, we first demonstrate the feasibility of our cost metrics (including functional point analysis and personnel factors). In particular, we design a quantitative method for evaluating the personnel metrics based on GitHub data. Then we design a samples incremental approach based on AdaBoost and Classification And Regression Tree (ABCART) to make the collected dataset owns dynamic expansion capability. Results Experimental results on the collected dataset show that: (1) the personnel factor is helpful for improving the performance of the effort estimation. (2) the proposed ABCART algorithm can increase the samples of the collected dataset online. (3) the estimators built on the collected data can achieve comparable performance with those of the estimators which built on existing effort datasets. Conclusions Effort estimation based on Open Source Project (OSP) is an effective way for getting the effort required by a new project, especially for the case of lacking training data.
更多
查看译文
关键词
Software effort estimation,Effort data collection,Automated function point,Open source project,AdaBoost
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要