Conversational Automated Program Repair
This addresses the issue of inefficient and repetitive patch generation in automated program repair for developers, offering a novel method that is not incremental but introduces a new paradigm.
The paper tackles the problem of automated program repair (APR) by proposing conversational APR, a new paradigm that alternates between patch generation and validation using large language models (LLMs), resulting in improved performance over prior LLM-based APR approaches as demonstrated with evaluations on 10 different LLMs including ChatGPT.
Automated Program Repair (APR) can help developers automatically generate patches for bugs. Due to the impressive performance obtained using Large Pre-Trained Language Models (LLMs) on many code related tasks, researchers have started to directly use LLMs for APR. However, prior approaches simply repeatedly sample the LLM given the same constructed input/prompt created from the original buggy code, which not only leads to generating the same incorrect patches repeatedly but also miss the critical information in testcases. To address these limitations, we propose conversational APR, a new paradigm for program repair that alternates between patch generation and validation in a conversational manner. In conversational APR, we iteratively build the input to the model by combining previously generated patches with validation feedback. As such, we leverage the long-term context window of LLMs to not only avoid generating previously incorrect patches but also incorporate validation feedback to help the model understand the semantic meaning of the program under test. We evaluate 10 different LLM including the newly developed ChatGPT model to demonstrate the improvement of conversational APR over the prior LLM for APR approach.