SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics

He Ye, Matias Martinez, Xiapu Luo, Tao Zhang, Martin Monperrus

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

Learning-based program repair has achieved good results in a recent series of papers. Yet, we observe that the related work fails to repair some bugs because of a lack of knowledge about 1) the application domain of the program being repaired, and 2) the fault type being repaired. In this paper, we solve both problems by changing the learning paradigm from supervised training to self-supervised training in an approach called SelfAPR. First, SelfAPR generates training samples on disk by perturbing a previous version of the program being repaired, forcing the neural model to capture project-specific knowledge. This is different from the previous work based on mined past commits. Second, SelfAPR executes all training samples and extracts and encodes test execution diagnostics into the input representation, steering the neural model to fix the kind of fault. This is different from the existing studies that only consider static source code as input. We implement SelfAPR and evaluate it in a systematic manner. We generate 1 039 873 training samples obtained by perturbing 17 open-source projects. We evaluate SelfAPR on 818 bugs from Defects4J; SelfAPR correctly repairs 110 of them, outperforming all the supervised learning repair approaches.
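
The two steps described in the abstract (perturbing correct code to obtain buggy training inputs, then attaching a test execution diagnostic to each input) can be illustrated with a minimal sketch. This is a toy in Python, not the authors' implementation: the perturbation rules, the `TrainingSample` type, the `toy_run_tests` stand-in, and the bracketed diagnostic format are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical perturbation rules: (regex on correct code, buggy replacement).
PERTURBATIONS = [
    (r"<=", "<"),
    (r"<=", ">"),
    (r"return ", "return not "),
]

@dataclass
class TrainingSample:
    source: str  # model input: diagnostic followed by the perturbed (buggy) line
    target: str  # model output: the original (correct) line

def perturb(line):
    """Yield one buggy variant of a correct line per applicable rule."""
    for pattern, replacement in PERTURBATIONS:
        buggy, count = re.subn(pattern, replacement, line, count=1)
        if count:
            yield buggy

def toy_run_tests(line):
    """Stand-in for a real test run: build `is_at_most` from `line`, check one case."""
    namespace = {}
    try:
        exec(f"def is_at_most(a, b):\n    {line}", namespace)
    except SyntaxError:
        return "COMPILE_ERROR"
    if namespace["is_at_most"](2, 2) is not True:
        return "TEST_FAILURE: expected True for is_at_most(2, 2)"
    return "NO_FAILURE"

def make_samples(correct_line, run_tests):
    """Pair each perturbed line (prefixed with its diagnostic) with the correct line."""
    return [
        TrainingSample(f"[{run_tests(buggy)}] {buggy}", correct_line)
        for buggy in perturb(correct_line)
    ]

samples = make_samples("return a <= b", toy_run_tests)
for sample in samples:
    print(sample.source, "=>", sample.target)
```

Because the correct line is known before perturbation, every sample comes with its own label, which is what makes the training self-supervised; the diagnostic prefix is the part of the input that would tell a model which kind of fault it is looking at.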
Original language: English
Title of host publication: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE)
Publisher: Association for Computing Machinery (ACM)
Pages: 1-13
Number of pages: 2006
ISBN (Electronic): 10.1145/3551349
ISBN (Print): 9781450394758
Publication status: Published - 5 Jan 2023
Event: 37th IEEE/ACM International Conference on Automated Software Engineering (ASE) - Ann Arbor, United States
Duration: 26 Sept 2022 to 1 Oct 2022
https://www.aconf.org/conf_181212.html

Conference

Conference: 37th IEEE/ACM International Conference on Automated Software Engineering (ASE)
Country/Territory: United States
City: Ann Arbor
Period: 26/09/22 to 1/10/22
