HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
HyperAgent addresses the need for versatile AI agents in software development, offering a scalable approach to coding tasks across multiple programming languages, though it builds incrementally on existing multi-agent frameworks.
The paper tackles the challenge of creating a generalist multi-agent system for diverse software engineering tasks by introducing HyperAgent, which is reported to outperform strong baselines on SWE-Bench and to often surpass state-of-the-art baselines on RepoExec and Defects4J.
Large Language Models (LLMs) have revolutionized software engineering (SE), showcasing remarkable proficiency in various coding tasks. Despite recent advancements that have enabled the creation of autonomous software agents utilizing LLMs for end-to-end development tasks, these systems are typically designed for specific SE functions. We introduce HyperAgent, an innovative generalist multi-agent system designed to tackle a wide range of SE tasks across different programming languages by mimicking the workflows of human developers. HyperAgent features four specialized agents (Planner, Navigator, Code Editor, and Executor) capable of handling the entire lifecycle of SE tasks, from initial planning to final verification. HyperAgent sets new benchmarks in diverse SE tasks, including GitHub issue resolution on the renowned SWE-Bench benchmark, outperforming robust baselines. Furthermore, HyperAgent demonstrates exceptional performance in repository-level code generation (RepoExec) and fault localization and program repair (Defects4J), often surpassing state-of-the-art baselines.
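To make the four-role division concrete, the following is a minimal sketch of how a planner might delegate to the other three roles until a task is verified. It is not HyperAgent's implementation: the abstract does not describe how the agents communicate, so the `TaskState` record, the fixed navigator, editor, executor ordering, the `max_rounds` limit, and the stubbed role functions are all assumptions made for illustration. In a real system each role would be backed by an LLM and tooling rather than a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TaskState:
    """Hypothetical shared record of what each role has produced so far."""
    description: str
    log: List[str] = field(default_factory=list)
    verified: bool = False


# Each role is a plain stub function here; these are illustrative only.
def navigator(state: TaskState) -> None:
    # Stand-in for locating the code relevant to the task.
    state.log.append("navigator: located relevant code")


def code_editor(state: TaskState) -> None:
    # Stand-in for proposing a change to the located code.
    state.log.append("code_editor: proposed a patch")


def executor(state: TaskState) -> None:
    # Stand-in for running checks; the stub always reports success.
    state.log.append("executor: ran checks")
    state.verified = True


def planner(state: TaskState, max_rounds: int = 3) -> TaskState:
    """Assumed control loop: delegate to each role in turn until verified."""
    roles: Dict[str, Callable[[TaskState], None]] = {
        "navigator": navigator,
        "code_editor": code_editor,
        "executor": executor,
    }
    state.log.append(f"planner: received task '{state.description}'")
    for _ in range(max_rounds):
        for name in ("navigator", "code_editor", "executor"):
            roles[name](state)
        if state.verified:
            break  # stop as soon as the executor reports success
    return state
```

Calling `planner(TaskState("resolve a GitHub issue"))` returns the state with one log entry per role plus the planner's own entry. Keeping the roles behind a shared state object is just one possible design; it makes each role easy to swap out or test in isolation.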