CommtPst: Deep learning source code for commenting positions prediction

Yuan Huang, Xinyu Hu, Nan Jia, Xiangping Chen, Zibin Zheng, Xiapu Luo

Research output: Journal article publication › Journal article › Academic research › peer-review

2 Citations (Scopus)

Abstract

Existing techniques for automatic code commenting assume that the code snippet to be commented has already been identified, and thus require users to provide the snippet in advance. A smarter commenting approach would first determine by itself where to comment in a given piece of source code and then generate comments for the code snippets that need them. To achieve the first step of this goal, we propose a novel method, CommtPst, that automatically finds appropriate commenting positions in source code. Since commenting is closely related to code syntax and semantics, we adopt a neural language model (word embeddings) to capture the semantic information of the code, and analyze abstract syntax trees to capture its syntactic information. We then employ an LSTM (long short-term memory) network to model the long-term logical dependencies among code statements over the fused semantic and syntactic information and to learn commenting patterns on the code sequence. We evaluated CommtPst on large data sets from dozens of open-source software systems on GitHub. The experimental results show that CommtPst achieves precision, recall and F-measure values of 0.792, 0.602 and 0.684, respectively, outperforming the traditional machine learning method with an 11.4% improvement in F-measure.
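
The three reported scores can be checked against one another with a short calculation. The sketch below assumes that the F-measure is the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state this explicitly, and the function name f_measure is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1; other weightings are possible.
    """
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall exactly as reported in the abstract.
reported_precision = 0.792
reported_recall = 0.602

computed_f = f_measure(reported_precision, reported_recall)
print(f"{computed_f:.3f}")  # prints 0.684
```

Under that assumption, the computed value agrees with the reported F-measure of 0.684 to three decimal places.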

Original language: English
Article number: 110754
Pages (from-to): 1-14
Journal: Journal of Systems and Software
Volume: 170
Publication status: Published - Dec 2020

Keywords

  • Code semantics
  • Code syntax
  • Comment generation
  • Comment position
  • LSTM

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
