Towards automatically generating block comments for code snippets

Information and Software Technology(2020)

引用 21|浏览113
暂无评分
摘要
Code commenting is a common programming practice of practical importance to help developers review and comprehend source code. There are two main types of code comments for a method: header comments that summarize the method functionality located before a method, and block comments that describe the functionality of the code snippets within a method. Inspired by the effectiveness of deep learning techniques in the NLP field, many studies focus on using the machine translation model to automatically generate comment for the source code. Because the data set of block comments is difficult to collect, current studies focus more on the automatic generation of header comments than that of block comments. However, block comments are important for program comprehension due to their explanation role for the code snippets in a method. To fill the gap, we have proposed an approach that combines heuristic rules and learning-based method to collect a large number of comment-code pairs from 1,032 open source projects in our previous study. In this paper, we propose a reinforcement learning-based method, RL-BlockCom, to automatically generate block comments for code snippets based on the collected comment-code pairs. Specifically, we utilize the abstract syntax tree (i.e., AST) of a code snippet to generate a token sequence with a statement-based traversal way. Then we propose a composite learning model, which combines the actor-critic algorithm of reinforcement learning with the encoder-decoder algorithm, to generate block comments. On the data set of the comment-code pairs, the BLEU-4 score of our method is 24.28, which outperforms the baselines and state-of-the-art in comment generation.
更多
查看译文
关键词
Automatic comment generation,Source code summarization,Code comment scope,Reinforcement learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要