MHCPDP: multi-source heterogeneous cross-project defect prediction via multi-source transfer learning and autoencoder

SOFTWARE QUALITY JOURNAL(2021)

引用 17|浏览12
暂无评分
摘要
Heterogeneous cross-project defect prediction (HCPDP) is aimed at building a defect prediction model for the target project by reusing datasets from source projects, where the source project datasets and target project dataset have different features. Most existing HCPDP methods only remove redundant or unrelated features without exploring the underlying features of cross-project datasets. Additionally, when the transfer learning method is used in HCPDP, these methods ignore the negative effect of transfer learning. In this paper, we propose a novel HCPDP method called multi-source heterogeneous cross-project defect prediction (MHCPDP). To reduce the gap between the target datasets and the source datasets, MHCPDP uses the autoencoder to extract the intermediate features from the original datasets instead of simply removing redundant and unrelated features and adopts a modified autoencoder algorithm to make instance selection for eliminating irrelevant instances from the source domain datasets. Furthermore, by incorporating multiple source projects to increase the number of source datasets, MHCPDP develops a multi-source transfer learning algorithm to reduce the impact of negative transfers and upgrade the performance of the classifier. We comprehensively evaluate MHCPDP on five open source datasets; our experimental results show that MHCPDP not only has significant improvement in two performance metrics but also overcomes the shortcomings of the conventional HCPDP methods.
更多
查看译文
关键词
Autoencoder,Heterogeneous cross-project defect prediction,Multi-source transfer learning,Modified autoencoder
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要