Enhancing Code Language Models for Program Repair by Curricular Fine-tuning Framework

Sichong Hao, Xianjun Shi, Hongwei Liu, Yanjun Shu

2023 IEEE International Conference on Software Maintenance and Evolution (ICSME 2023)

Abstract
Automated program repair (APR) is a key technique for enhancing software maintenance productivity by fixing buggy code automatically. Recently, large code language models (CLMs) have exhibited impressive capabilities in code generation. However, for complex programming tasks, especially program repair, the success rate of CLMs remains low. One reason is that CLMs are typically developed for general purposes, and their potential for APR applications has yet to be fully explored. In this paper, we propose APRFiT, a general curricular fine-tuning framework that improves the success rate of CLMs on APR. First, APRFiT applies code augmentation operators to generate syntactically diverse but semantically equivalent bug-fixing programs, automatically enriching the diversity of the bug-fixing dataset. Second, APRFiT employs a curriculum learning-based mechanism that helps CLMs develop a deeper understanding of program semantics from these augmented bug-fixing code variants, improving the effectiveness of fine-tuning for APR tasks. We implement APRFiT on different CLMs and evaluate them on the Bugs2Fix small and medium datasets. Extensive experiments demonstrate that CLMs fine-tuned with APRFiT substantially outperform their original counterparts, generating 2.5 to 14.5 percent more correct patches than the baselines while remaining efficient.
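The abstract describes two ingredients: semantics-preserving code augmentation and a curriculum-ordered fine-tuning schedule. The Python sketch below is only an illustration of these general ideas under assumed choices; the concrete augmentation operators, difficulty measure, and scheduling used by APRFiT are not specified in the abstract, so the variable-renaming operator and length-based difficulty proxy here are hypothetical stand-ins, not the paper's implementation.

```python
# Illustrative sketch only: the operator and difficulty metric below are
# assumptions standing in for APRFiT's augmentation and curriculum ideas.
import re

# Small, non-exhaustive keyword list so the naive renamer does not touch
# common Java keywords (a real operator would use a proper parser).
JAVA_KEYWORDS = {"if", "else", "for", "while", "return", "int", "new",
                 "null", "true", "false", "this", "void", "break", "case"}

def rename_variables(java_snippet: str, prefix: str = "v") -> str:
    """Semantics-preserving augmentation (hypothetical operator):
    consistently rename lower-case identifiers so the variant is
    syntactically different but behaviourally equivalent."""
    names = sorted({n for n in re.findall(r"\b[a-z][A-Za-z0-9_]*\b", java_snippet)
                    if n not in JAVA_KEYWORDS})
    mapping = {n: f"{prefix}{i}" for i, n in enumerate(names)}
    return re.sub(r"\b[a-z][A-Za-z0-9_]*\b",
                  lambda m: mapping.get(m.group(0), m.group(0)),
                  java_snippet)

def difficulty(example: dict) -> int:
    """Hypothetical difficulty proxy: longer buggy functions are treated
    as harder repair instances."""
    return len(example["buggy"].split())

def curriculum_stages(dataset: list, n_stages: int = 3):
    """Order (augmented) bug-fix pairs from easy to hard and yield
    cumulative training pools stage by stage, as a curriculum-learning
    schedule would before each fine-tuning phase."""
    ordered = sorted(dataset, key=difficulty)
    stage_size = max(1, len(ordered) // n_stages)
    for s in range(1, n_stages + 1):
        yield ordered[: s * stage_size]  # easy examples first, then harder ones

# Example usage: augment each pair, then fine-tune stage by stage.
if __name__ == "__main__":
    data = [{"buggy": "int a = b + c ;", "fixed": "int a = b - c ;"}]
    augmented = data + [{"buggy": rename_variables(d["buggy"]),
                         "fixed": rename_variables(d["fixed"])} for d in data]
    for stage, pool in enumerate(curriculum_stages(augmented), start=1):
        print(f"stage {stage}: {len(pool)} training pairs")
```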
Keywords
Program Repair,Large Language Models of Code,Curriculum Learning