Are NLP Metrics Suitable for Evaluating Generated Code?

PROFES (2022)

Abstract
Code generation is a technique that produces program source code without human intervention. There has been much research on automated methods for writing code, such as code generation; however, many techniques are still in their infancy and often generate syntactically incorrect code. Consequently, automated metrics from natural language processing (NLP) are sometimes used to evaluate code generation techniques. At present, it is unclear which NLP metrics are more suitable than others for evaluating generated code. In this study, we clarify which NLP metrics are applicable to syntactically incorrect code and suitable for evaluating techniques that generate code automatically. Our results show that METEOR is the best of the automated metrics compared in this study.
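The metrics discussed here score a generated program against a reference program as token sequences, so they can be computed even when the generated code does not parse. As a minimal sketch of how such a token-level NLP metric is applied to code, the following is a plain BLEU-style score (clipped n-gram precision with a brevity penalty and simple add-one smoothing); the abstract's finding concerns METEOR, which additionally uses stemming and synonym matching, but BLEU is simpler to show self-contained. The tokenized code snippets are hypothetical examples, not from the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty.
    Add-one smoothing keeps one missing n-gram order from
    zeroing the whole score."""
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(reference, n))
        cand_counts = Counter(ngrams(candidate, n))
        # clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(1, sum(cand_counts.values()))
        precisions.append((overlap + 1) / (total + 1))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # penalize candidates shorter than the reference
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * geo_mean

# Hypothetical example: reference code vs. generated code, pre-tokenized.
# The generated version swaps the operands, so lower-order n-grams still
# match while higher-order ones diverge.
ref = "def add ( a , b ) : return a + b".split()
gen = "def add ( a , b ) : return b + a".split()
print(f"BLEU = {sentence_bleu(ref, gen):.3f}")
```

Because the score is purely lexical, a syntactically invalid candidate is handled the same way as a valid one, which is exactly the property that makes this family of metrics applicable to immature code generators.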
Keywords
Automated metric,Code generation,Deep learning