SE | AI | MA | Oct 22, 2024

Self-Evolving Multi-Agent Collaboration Networks for Software Development

arXiv:2410.16946v1 | 69 citations | h-index: 10 | ICLR
Originality: Highly original
AI Analysis

This work addresses the limited adaptability of automated software development systems, though it appears incremental in that it builds on existing multi-agent collaboration systems.

The authors tackle the limited adaptability of LLM-driven multi-agent collaboration systems in software development by introducing EvoMAC, a self-evolving paradigm that outperforms previous state-of-the-art methods on both software-level and function-level benchmarks.

LLM-driven multi-agent collaboration (MAC) systems have demonstrated impressive capabilities in automatic software development at the function level. However, their heavy reliance on human design limits their adaptability to the diverse demands of real-world software development. To address this limitation, we introduce EvoMAC, a novel self-evolving paradigm for MAC networks. Inspired by traditional neural network training, EvoMAC obtains text-based environmental feedback by verifying the MAC network's output against a target proxy and leverages a novel textual backpropagation to update the network. To extend coding capabilities beyond function-level tasks to more challenging software-level development, we further propose rSDE-Bench, a requirement-oriented software development benchmark, which features complex and diverse software requirements along with automatic evaluation of requirement correctness. Our experiments show that: i) The automatic requirement-aware evaluation in rSDE-Bench closely aligns with human evaluations, validating its reliability as a software-level coding benchmark. ii) EvoMAC outperforms previous SOTA methods on both the software-level rSDE-Bench and the function-level HumanEval benchmarks, reflecting its superior coding capabilities. The benchmark can be downloaded at https://yuzhu-cai.github.io/rSDE-Bench/.
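The self-evolving loop described in the abstract (forward pass, verification against a target proxy, textual feedback, network update) can be illustrated with a toy sketch. This is not the authors' implementation. `Agent`, `toy_llm`, `forward`, `verify`, `textual_backward`, and `evolve` are hypothetical names introduced here. The sketch assumes the target proxy is a small set of test cases and that "updating the network" means appending the textual feedback to each agent's prompt. A deterministic stub stands in for the LLM so the example runs offline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Agent:
    """One node of the collaboration network; its prompt plays the role of a weight."""
    name: str
    prompt: str


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: emits buggy code until the prompt carries feedback."""
    if "FEEDBACK" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


def forward(agents: List[Agent], requirement: str) -> str:
    """Forward pass: each agent sees the requirement and the previous agent's output."""
    output = ""
    for agent in agents:
        output = toy_llm(
            f"{agent.prompt}\nRequirement: {requirement}\nPrevious output:\n{output}"
        )
    return output


def verify(code: str, cases: List[Tuple[Tuple[int, int], int]]) -> List[str]:
    """Run the generated code against the test cases (the assumed target proxy)
    and return the failures as plain-text feedback."""
    namespace: dict = {}
    exec(code, namespace)  # acceptable for a toy; never exec untrusted code
    failures = []
    for args, expected in cases:
        got = namespace["add"](*args)
        if got != expected:
            failures.append(f"add{args} returned {got}, expected {expected}")
    return failures


def textual_backward(agents: List[Agent], failures: List[str]) -> None:
    """Assumed update rule: append the textual feedback to every agent's prompt."""
    note = "FEEDBACK: " + "; ".join(failures)
    for agent in agents:
        agent.prompt += "\n" + note


def evolve(agents, requirement, cases, max_iters: int = 3) -> Tuple[str, int]:
    """Repeat forward pass, verification, and update until the tests pass
    or the iteration budget is exhausted."""
    code = ""
    for iteration in range(1, max_iters + 1):
        code = forward(agents, requirement)
        failures = verify(code, cases)
        if not failures:
            return code, iteration
        textual_backward(agents, failures)
    return code, max_iters


agents = [Agent("coder", "Write Python code."), Agent("reviewer", "Fix the code.")]
cases = [((1, 2), 3), ((0, 5), 5)]
final_code, iterations = evolve(agents, "implement add(a, b)", cases)
print(iterations)
print(final_code)
```

In this toy run the first pass fails both test cases, the failure messages are written into the agents' prompts, and the second pass produces code that passes. A real system would replace `toy_llm` with model calls and would likely use a far richer update than prompt concatenation.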

