Automated change-prone class prediction on unlabeled dataset using unsupervised method.

Information and Software Technology (2017)

Abstract

Context: Software change-prone class prediction can support decision making during software maintenance (e.g., resource allocation). Researchers have proposed many change-prone class prediction approaches, and most are effective on labeled datasets (projects with historical labeled data). These approaches usually build a supervised model by learning from historical labeled data. However, a major challenge is that this typical prediction setting cannot be used for unlabeled datasets (e.g., new projects or projects with limited historical data). Although cross-project prediction is one solution for unlabeled datasets, it needs prior labeled data from other projects, and selecting an appropriate training project is a difficult task.

Objective: We aim to build a change-prone class prediction model on unlabeled datasets without the need for prior labeled data.

Method: We propose to tackle this task by adopting a state-of-the-art unsupervised method, namely CLAMI. In addition, we propose a novel unsupervised approach, CLAMI+, which extends CLAMI. The key idea is to enable change-prone class prediction on an unlabeled dataset by learning from the dataset itself.

Results: Experiments on 14 open source projects show that, on average, the unsupervised methods achieve results comparable to the typical supervised within-project and cross-project prediction baselines, and that the proposed CLAMI+ slightly improves on CLAMI on average.

Conclusion: Our study shows that unsupervised methods are effective for building change-prone class prediction models. The approach is convenient for practical use in industry, since it does not need prior labeled data.
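The abstract does not spell out how a dataset can be labeled "by learning from itself", so the following is only a rough sketch of the clustering-and-labeling idea usually associated with CLAMI, not the authors' implementation. The function name cla_label, the per-metric median as cutoff, and the rule that the upper half of the distinct scores counts as change-prone are all assumptions made for illustration; the metric-selection and instance-selection steps, and anything specific to CLAMI+, are omitted.

```python
from statistics import median


def cla_label(rows):
    """Assign a provisional label to each class without any prior labels.

    rows: list of equal-length lists of numeric metric values, one per class.
    Returns a list of booleans (True = provisionally change-prone).
    Hypothetical helper; the cutoff and labeling rule are assumptions.
    """
    if not rows:
        return []
    n_metrics = len(rows[0])
    # Assumed cutoff: the median of each metric over the whole dataset.
    medians = [median(row[j] for row in rows) for j in range(n_metrics)]
    # Score K = number of metrics on which a class exceeds the median.
    k_scores = [
        sum(1 for j in range(n_metrics) if row[j] > medians[j])
        for row in rows
    ]
    # Classes sharing the same K form one cluster; the clusters in the
    # upper half of the distinct K values are labeled change-prone.
    distinct = sorted(set(k_scores))
    top_half = set(distinct[len(distinct) // 2:])
    return [k in top_half for k in k_scores]


if __name__ == "__main__":
    # Toy metric vectors (made up for illustration, not from the paper).
    toy = [[1, 1, 1], [2, 2, 2], [10, 10, 10], [9, 8, 9]]
    print(cla_label(toy))
```

The provisional labels produced this way could then be used to train an ordinary classifier on the same project, which is one way to read "learning from itself". Note that if every class gets the same score, this sketch labels all of them change-prone, which a real implementation would need to handle.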
Keywords
Software maintenance,Change-prone prediction,Unlabeled dataset,Unsupervised prediction