Program Repair with Minimal Edits Using CodeT5
This work addresses the comprehension difficulty programmers face when a suggested repair departs too far from their original code. The contribution is incremental, since it builds on the existing CodeT5 model.
The paper tackles the problem of language models generating program repairs that differ too much from the original code, and proposes a fine-tuned CodeT5 approach that suggests correct programs with minimal edits. The model achieves a pass@100 of 91.95% and an average edit distance of 6.84 to the most similar correct program, showing its effectiveness on introductory programming problems.
Programmers often struggle to identify and fix bugs in their programs. In recent years, many language models (LMs) have been proposed to fix erroneous programs and support error recovery. However, these LMs tend to generate solutions that differ substantially from the original input programs, which can make the suggested repairs difficult for users to understand. In this paper, we propose an approach that suggests a correct program with minimal repair edits using CodeT5. We fine-tune a pre-trained CodeT5 on pairs of wrong and correct programs and compare its performance against several baseline models. The experimental results show that the fine-tuned CodeT5 achieves a pass@100 of 91.95% and an average edit distance of 6.84 to the most similar correct program, which indicates that generating 100 candidate programs yields at least one correct program in most cases. We demonstrate the effectiveness of LMs in suggesting program repairs with minimal edits for introductory programming problems.
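The two reported metrics can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: edit distance is taken here as character-level Levenshtein distance (the paper may measure it over tokens or in another way), pass@k is computed as the fraction of problems for which at least one of the k generated candidates passes, and the names `levenshtein`, `closest_correct`, `pass_at_k`, and `passes_tests` are hypothetical helpers introduced only for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Count the insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_correct(
    wrong: str,
    candidates: Sequence[str],
    passes_tests: Callable[[str], bool],
) -> Optional[Tuple[str, int]]:
    """Return the passing candidate nearest to the wrong program, with its distance.

    `passes_tests` is an assumed stand-in for running a candidate against
    the problem's test cases. Returns None if no candidate passes.
    """
    best: Optional[Tuple[str, int]] = None
    for cand in candidates:
        if not passes_tests(cand):
            continue
        dist = levenshtein(wrong, cand)
        if best is None or dist < best[1]:
            best = (cand, dist)
    return best


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Fraction of problems where at least one of the k candidates passes.

    `results[i][j]` says whether candidate j for problem i passed its tests.
    """
    return sum(any(r) for r in results) / len(results)
```

For example, given the wrong program `print(a - b)` and the candidates `print(a + b)`, `print(sum([a, b]))`, and `print(a * b)`, where the first two pass the tests, `closest_correct` would select `print(a + b)` at a distance of 1. Preferring the nearest passing candidate is what keeps the suggested repair close to the code the programmer already wrote.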