Fine-Tuning GPT-2 to Patch Programs, Is It Worth It?

International Conference on Computational Science and Its Applications (ICCSA), 2022

Abstract
The adoption of Artificial Intelligence (AI) in the Software Engineering (SE) field always lags somewhat behind state-of-the-art research results. Although the Generative Pre-trained Transformer 2 (GPT-2) model was published in 2019, only a few recent works have used it for SE tasks. One such task is Automated Program Repair (APR), in which the applied technique should fix software bugs without human intervention. One problem emerges here: creating proper training data is resource-intensive and requires many additional hours of work from researchers, because training a model to repair programs automatically requires both the buggy program and its fixed version at a large scale, presumably in an already pre-processed form. Few such databases currently exist, so training and fine-tuning models is not an easy task. In this work, we investigate how the GPT-2 model performs on the APR task when it is not fine-tuned, compared to when it is fine-tuned. Previous work has shown that the GPT-2 model can automatically generate patches for buggy programs, but the literature lacks studies in which no fine-tuning has taken place. For the experiment, we evaluated the GPT-2 model both out-of-the-box and after fine-tuning on 1559 JavaScript code snippets. Based on our results, we conclude that while the fine-tuned model learned to produce syntactically correct source code on almost every attempt, the non-fine-tuned model lacked some of these positive features.
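The abstract does not spell out the authors' training pipeline, but a minimal sketch of fine-tuning GPT-2 on buggy-to-fixed code pairs, using the Hugging Face transformers library, might look as follows. The pair formatting, the `<FIX>` separator, and the hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (not the authors' pipeline): fine-tune GPT-2 on
# buggy -> fixed code pairs with Hugging Face transformers.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, Trainer, TrainingArguments

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical training pairs; the paper uses 1559 JavaScript snippets,
# but their exact formatting is assumed here.
pairs = [
    ("if (x = 1) { run(); }", "if (x === 1) { run(); }"),
]

class PatchDataset(torch.utils.data.Dataset):
    """One example = buggy snippet, a separator, then the fixed snippet."""
    def __init__(self, pairs):
        self.encodings = [
            tokenizer(f"{bug} <FIX> {fix}", truncation=True, max_length=256,
                      padding="max_length", return_tensors="pt")
            for bug, fix in pairs
        ]
    def __len__(self):
        return len(self.encodings)
    def __getitem__(self, i):
        ids = self.encodings[i]["input_ids"].squeeze(0)
        mask = self.encodings[i]["attention_mask"].squeeze(0)
        # Language-modeling objective: labels are the input ids themselves
        # (padded positions would ideally be masked with -100).
        return {"input_ids": ids, "attention_mask": mask, "labels": ids}

args = TrainingArguments(output_dir="gpt2-apr", num_train_epochs=3,
                         per_device_train_batch_size=4)
Trainer(model=model, args=args, train_dataset=PatchDataset(pairs)).train()
```

The non-fine-tuned baseline studied in the paper would correspond to skipping the training step entirely and calling `model.generate` on the buggy snippet (plus separator) directly.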
Keywords
Automated Program Repair, Machine learning, JavaScript, Code refinement, GPT-2, Fine-tuning