Seeing the Whole Elephant: Systematically Understanding and Uncovering Evaluation Biases in Automated Program Repair

ACM Transactions on Software Engineering and Methodology (2022)

Abstract
Evaluation is the foundation of automated program repair (APR), as it provides empirical evidence on the strengths and weaknesses of APR techniques. However, the reliability of such evaluation is often threatened by various biases. Consequently, bias exploration, which uncovers biases in APR evaluation, has become a pivotal activity and has been performed since the early years, when pioneering APR techniques were proposed. Unfortunately, there is still no methodology to support the systematic comprehension and discovery of evaluation biases in APR, which impedes the mitigation of such biases and threatens the evaluation of APR techniques. In this work, we propose to systematically understand existing evaluation biases by rigorously conducting the first systematic literature review of known biases, and to systematically uncover new biases by building a taxonomy that categorizes evaluation biases. As a result, we identify 17 investigated biases and uncover a new bias in the usage of patch validation strategies. To validate this new bias, we devise and implement an executable framework, APRConfig, with which we evaluate three typical patch validation strategies using four representative heuristic-based and constraint-based APR techniques on three bug datasets. Overall, this paper distills 13 findings for bias understanding, discovery, and validation. The systematic exploration we performed and the open-source executable framework we propose provide new insights as well as an infrastructure for future exploration and mitigation of biases in APR evaluation.
Keywords
automated program repair, uncovering evaluation biases