LG | SE | May 26, 2021

Self-Supervised Bug Detection and Repair

arXiv:2105.12787v3 | 138 citations | Has Code
Originality: Highly original
AI Analysis

This paper addresses bug detection and repair for software developers, offering a self-supervised method that reduces reliance on annotated data.

The paper tackles the challenge of training machine learning-based program analyses without large annotated datasets by introducing BugLab, a self-supervised approach for bug detection and repair. It improves by up to 30% over baselines on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software.

Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a detector model that learns to detect and repair bugs in code, (2) a selector model that learns to create buggy code for the detector to use as training data. A Python implementation of BugLab improves by up to 30% upon baseline methods on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software.
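The co-training idea in the abstract can be illustrated with a toy sketch. This is not the BugLab implementation: the rewrite table, the `ToyDetector` class, and the helpers `select_bug`, `co_train`, and `repair` are all hypothetical names invented for illustration. The sketch also assumes that the selector favours bugs the detector currently finds least suspicious, which the abstract does not state, and it replaces the learned models with a simple per-token weight table.

```python
# Toy sketch of selector/detector co-training. All names are hypothetical.

# Hypothetical rewrite rules: a correct token and its buggy replacement.
REWRITES = {"<": "<=", "==": "!=", "+": "-", "and": "or"}
INVERSE = {buggy: fixed for fixed, buggy in REWRITES.items()}


def candidate_bugs(tokens):
    """List every (position, replacement) rewrite applicable to the snippet."""
    return [(i, REWRITES[t]) for i, t in enumerate(tokens) if t in REWRITES]


def apply_rewrite(tokens, position, replacement):
    """Return a copy of the snippet with one token replaced."""
    out = list(tokens)
    out[position] = replacement
    return out


class ToyDetector:
    """Stand-in for a learned detector: one suspicion weight per token."""

    def __init__(self):
        self.suspicion = {}

    def score(self, token):
        return self.suspicion.get(token, 0.0)

    def localize(self, tokens):
        """Index of the most suspicious token, or None if nothing is flagged."""
        if not tokens:
            return None
        best = max(range(len(tokens)), key=lambda i: self.score(tokens[i]))
        return best if self.score(tokens[best]) > 0 else None

    def update(self, buggy_tokens, bug_position, lr=1.0):
        """Raise suspicion of the injected token when the bug was missed."""
        if self.localize(buggy_tokens) != bug_position:
            token = buggy_tokens[bug_position]
            self.suspicion[token] = self.score(token) + lr


def select_bug(detector, candidates):
    """Assumed selector policy: pick the rewrite the detector suspects least."""
    return min(candidates, key=lambda c: detector.score(c[1]))


def co_train(corpus, rounds=3):
    """Alternate between injecting a bug and training the detector on it."""
    detector = ToyDetector()
    for _ in range(rounds):
        for tokens in corpus:
            candidates = candidate_bugs(tokens)
            if not candidates:
                continue
            position, replacement = select_bug(detector, candidates)
            buggy = apply_rewrite(tokens, position, replacement)
            detector.update(buggy, position)
    return detector


def repair(detector, tokens):
    """Undo the rewrite at the localized position, if any token is flagged."""
    position = detector.localize(tokens)
    if position is None:
        return list(tokens)
    return apply_rewrite(tokens, position, INVERSE[tokens[position]])


if __name__ == "__main__":
    corpus = [
        ["if", "a", "<", "b", "and", "c", "==", "d"],
        ["x", "=", "y", "+", "1"],
    ]
    detector = co_train(corpus)
    print(repair(detector, ["x", "=", "y", "-", "1"]))
```

Because the selector always picks the least suspicious rewrite, each round exposes the detector to a bug kind it has not yet learned, which is the intuition behind training the two components together. A real system would replace the weight table with learned models over program representations and would need unmodified code as negative examples.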

Code Implementations: 1 repo
