On the Generalizability of Deep Learning-based Code Completion Across Programming Language Versions
arXiv (2024)
Abstract
Code completion is a key feature of Integrated Development Environments
(IDEs), aimed at predicting the next tokens a developer is likely to write,
helping them write code faster and with less effort. Modern code completion
approaches are often powered by deep learning (DL) models. However, the swift
evolution of programming languages poses a critical challenge to the
performance of DL-based code completion models: Can these models generalize
across different language versions? This paper investigates this question. In
particular, we assess the capability of a state-of-the-art model, CodeT5, to
generalize across nine different Java versions, ranging from Java 2 to Java 17,
while being exclusively trained on Java 8 code. Our evaluation spans three
completion scenarios, namely, predicting tokens, constructs (e.g., the
condition of an if statement), and entire code blocks. The results of our study
reveal a noticeable disparity across language versions, with the worst
performance observed on Java 2 and Java 17, the versions furthest from Java 8.
We investigate possible causes of the performance
degradation and show that limited version-specific fine-tuning can partially
alleviate the problem. Our work raises awareness of the importance of
continuous model refinement, and it can inform the design of alternatives that
make code completion models more robust to language evolution.
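As a concrete illustration of the construct-level completion scenario mentioned above (filling in the condition of an if statement), the sketch below queries the publicly released CodeT5 checkpoint through Hugging Face Transformers and asks it to complete a masked span in a small Java method. The checkpoint name, Java snippet, and generation settings are illustrative assumptions and not the paper's exact experimental setup.

# Minimal sketch: span completion with the public CodeT5 checkpoint.
# Assumes the `transformers` library is installed and the checkpoint
# "Salesforce/codet5-base" can be downloaded; this mirrors the generic
# masked-span usage of CodeT5, not the paper's replication package.
from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Java method with the if condition masked out (construct-level scenario).
masked_code = (
    "public int max(int a, int b) {\n"
    "    if (<extra_id_0>) {\n"
    "        return a;\n"
    "    }\n"
    "    return b;\n"
    "}"
)

input_ids = tokenizer(masked_code, return_tensors="pt").input_ids
generated_ids = model.generate(input_ids, max_length=20)
# Prints the model's proposed completion for the masked condition.
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))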