Towards an understanding of intra-defect associations: Implications for defect prediction

Journal of Systems and Software (2024)

Abstract
In previous studies, when defect data are collected, every module touched by the fix of a defect is labeled as defective if that fix spans multiple modules. Defect prediction models are then built from the features of each individual module, ignoring the potential associations between the modules involved in the same defect (referred to as "intra-defect associations"). Because cross-module defects may be numerous in practice, we hypothesize that these intra-defect associations could play a crucial role in improving defect prediction performance. So far, however, there is no empirical evidence to support or refute this hypothesis. We therefore conduct a comprehensive study of the implications of intra-defect associations for defect prediction. We first examine the proportion of cross-module defects and the relationships between the involved modules. The results reveal that, at the function level, the majority of defects span multiple functions, and that most cross-module defects exhibit implicit dependencies. Inspired by these findings, we propose a novel data processing approach for building defect prediction models. The approach leverages intra-defect associations by merging the modules involved in the same defect into new instances, whose variables take the mean or median of the merged modules' values, and uses these instances to augment the training data. The experimental results indicate that considering intra-defect associations can significantly improve defect prediction performance in both the ranking and classification scenarios. This study provides valuable insights into the implications of intra-defect associations for defect prediction.
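The merging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how merged instances are labeled or how the defect-to-module mapping is stored, so the function name `augment_training_data`, the data layout, the toy feature values, and the choice to label every merged instance as defective are all assumptions made for illustration.

```python
from statistics import mean, median


def augment_training_data(instances, defect_to_modules, agg="mean"):
    """Return the original instances plus one merged instance per cross-module defect.

    instances: dict mapping module name -> (feature list, label), label 1 = defective
    defect_to_modules: dict mapping defect id -> names of modules touched by its fix
    agg: "mean" or "median", the statistic used to combine feature values
    """
    if agg not in ("mean", "median"):
        raise ValueError("agg must be 'mean' or 'median'")
    combine = mean if agg == "mean" else median

    # Start from the unchanged original training instances.
    augmented = [(features, label) for features, label in instances.values()]

    for defect_id in sorted(defect_to_modules):
        names = defect_to_modules[defect_id]
        if len(names) < 2:
            continue  # not a cross-module defect, so there is nothing to merge
        # Transpose so that each column holds one feature across the involved modules.
        columns = zip(*(instances[name][0] for name in names))
        merged_features = [combine(column) for column in columns]
        # Assumption: a merged instance is labeled defective (1).
        augmented.append((merged_features, 1))
    return augmented


# Toy data with two hypothetical features per function (e.g. size and complexity).
instances = {
    "parse": ([10.0, 2.0], 1),
    "lex": ([30.0, 4.0], 1),
    "emit": ([20.0, 9.0], 1),
    "log": ([5.0, 1.0], 1),
    "util": ([8.0, 1.0], 0),
}
# Defect D1 spans three functions; defect D2 touches only one.
defect_to_modules = {"D1": ["parse", "lex", "emit"], "D2": ["log"]}

mean_rows = augment_training_data(instances, defect_to_modules, agg="mean")
median_rows = augment_training_data(instances, defect_to_modules, agg="median")
```

In this sketch only D1 produces a new instance, because D2 involves a single function. The augmented list could then be passed to any classifier or ranking model in place of the original training set.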
Keywords
Defect prediction, Intra-defect association, Data processing, Software quality