CYCLE: Learning to Self-Refine the Code Generation
Proceedings of the ACM on Programming Languages (2024)
Abstract
Pre-trained code language models have achieved promising performance in code
generation and improved the programming efficiency of human developers.
However, their self-refinement capability is typically overlooked by the
existing evaluations of code LMs, which focus only on the accuracy of the
one-time prediction. When code LMs fail to implement the correct program,
developers find it hard to debug and fix the faulty prediction, since they did
not write it themselves. Unfortunately, our study reveals that code LMs cannot
efficiently self-refine their faulty generations either.
In this paper, we propose the CYCLE framework, which learns to self-refine
faulty generations according to the available feedback, such as the execution results
reported by the test suites. We evaluate CYCLE on three popular code generation
benchmarks, HumanEval, MBPP, and APPS. The results reveal that CYCLE
successfully maintains, sometimes improves, the quality of one-time code
generation, while significantly improving the self-refinement capability of
code LMs. We implement four variants of CYCLE with varied numbers of parameters
across 350M, 1B, 2B, and 3B, and the experiments show that CYCLE consistently
boosts the code generation performance, by up to 63.5%, across benchmarks and
varied model sizes. We also notice that CYCLE outperforms code LMs that have
3× more parameters in self-refinement.
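
The refinement-from-feedback idea described above can be illustrated with a minimal sketch. This is not the CYCLE implementation: the function names (`run_tests`, `self_refine`, `toy_generator`), the generator signature, and the round limit are all assumptions made for illustration, and a real code LM would take the place of the toy generator. The sketch only shows the general loop of generating code, executing a test suite, and feeding the execution result back into the next generation attempt.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, Optional, Tuple

# Hypothetical generator signature: (problem, previous_code, feedback) -> code.
# In practice this would wrap a code LM; here it is just a callable.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def run_tests(code: str, tests: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run candidate code followed by assert-style tests in a subprocess.

    Returns (passed, feedback), where feedback is the captured stderr.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests + "\n")
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    finally:
        os.unlink(path)
    return proc.returncode == 0, proc.stderr.strip()


def self_refine(
    problem: str,
    tests: str,
    generate: Generator,
    max_rounds: int = 3,  # assumed limit, not taken from the paper
) -> Tuple[str, bool, int]:
    """Generate once, then refine using execution feedback until tests pass.

    Returns (final_code, passed, number_of_refinement_rounds_used).
    """
    code = generate(problem, None, None)  # one-time prediction
    for round_idx in range(max_rounds + 1):
        passed, feedback = run_tests(code, tests)
        if passed:
            return code, True, round_idx
        if round_idx == max_rounds:
            break
        # Condition the next attempt on the faulty code and its feedback.
        code = generate(problem, code, feedback)
    return code, False, max_rounds


def toy_generator(
    problem: str, previous_code: Optional[str], feedback: Optional[str]
) -> str:
    """Stand-in for a code LM: faulty on the first try, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"  # deliberately faulty
    return "def add(a, b):\n    return a + b\n"
```

Calling `self_refine("Write add(a, b).", "assert add(2, 3) == 5", toy_generator)` runs the faulty first attempt, captures the assertion failure, and passes it to the second generation call. Outside a toy setting, generated code should be executed in a sandbox rather than directly on the host.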
Keywords
Code Generation, Code Language Models, Iterative Programming, Source Code Modeling