Towards automatically generating block comments for code snippets

Yuan Huang, Shaohao Huang, Huanchao Chen, Xiangping Chen, Zibin Zheng, Xiapu Luo, Nan Jia, Xinyu Hu, Xiaocong Zhou

Research output: Journal article publicationJournal articleAcademic researchpeer-review

3 Citations (Scopus)

Abstract

Code commenting is a common programming practice of practical importance to help developers review and comprehend source code. There are two main types of code comments for a method: header comments that summarize the method functionality located before a method, and block comments that describe the functionality of the code snippets within a method. Inspired by the effectiveness of deep learning techniques in the NLP field, many studies focus on using the machine translation model to automatically generate comment for the source code. Because the data set of block comments is difficult to collect, current studies focus more on the automatic generation of header comments than that of block comments. However, block comments are important for program comprehension due to their explanation role for the code snippets in a method. To fill the gap, we have proposed an approach that combines heuristic rules and learning-based method to collect a large number of comment-code pairs from 1,032 open source projects in our previous study. In this paper, we propose a reinforcement learning-based method, RL-BlockCom, to automatically generate block comments for code snippets based on the collected comment-code pairs. Specifically, we utilize the abstract syntax tree (i.e., AST) of a code snippet to generate a token sequence with a statement-based traversal way. Then we propose a composite learning model, which combines the actor-critic algorithm of reinforcement learning with the encoder-decoder algorithm, to generate block comments. On the data set of the comment-code pairs, the BLEU-4 score of our method is 24.28, which outperforms the baselines and state-of-the-art in comment generation.

Original languageEnglish
Article number106373
Pages (from-to)1-12
JournalInformation and Software Technology
Volume127
DOIs
Publication statusPublished - Nov 2020

Keywords

  • Automatic comment generation
  • Code comment scope
  • Reinforcement learning
  • Source code summarization

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Computer Science Applications

Cite this