SE AIApr 8, 2024

AutoCodeRover: Autonomous Program Improvement

Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury

arXiv:2404.05427v340.3312 citationsh-index: 18Has CodeISSTA

Originality Incremental advance

AI Analysis

This addresses the need for autonomous software engineering tools to reduce manual effort in bug fixing and feature additions, though it is incremental by building on existing LLM and code search methods.

The paper tackles the problem of automating program improvement for software maintenance and evolution by proposing AutoCodeRover, which combines LLMs with structured code search and fault localization to solve GitHub issues, achieving a 19% success rate on SWE-bench-lite at an average cost of $0.43 per issue.

Researchers have made significant progress in automating the software development process in the past decades. Recent progress in Large Language Models (LLMs) has significantly impacted the development process, where developers can use LLM-based programming assistants to achieve automated coding. Nevertheless, software engineering involves the process of program improvement apart from coding, specifically to enable software maintenance (e.g. bug fixing) and software evolution (e.g. feature additions). In this paper, we propose an automated approach for solving GitHub issues to autonomously achieve program improvement. In our approach called AutoCodeRover, LLMs are combined with sophisticated code search capabilities, ultimately leading to a program modification or patch. In contrast to recent LLM agent approaches from AI researchers and practitioners, our outlook is more software engineering oriented. We work on a program representation (abstract syntax tree) as opposed to viewing a software project as a mere collection of files. Our code search exploits the program structure in the form of classes/methods to enhance LLM's understanding of the issue's root cause, and effectively retrieve a context via iterative search. The use of spectrum-based fault localization using tests, further sharpens the context, as long as a test-suite is available. Experiments on SWE-bench-lite (300 real-life GitHub issues) show increased efficacy in solving GitHub issues (19% on SWE-bench-lite), which is higher than the efficacy of the recently reported SWE-agent. In addition, AutoCodeRover achieved this efficacy with significantly lower cost (on average, $0.43 USD), compared to other baselines. We posit that our workflow enables autonomous software engineering, where, in future, auto-generated code from LLMs can be autonomously improved.

View on arXiv PDF Code

Similar