SE | AI | CL | May 24, 2025

SEW: Self-Evolving Agentic Workflows for Automated Code Generation

Cambridge
arXiv:2505.18646v1 | 12 citations | h-index: 13 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of automating workflow design for complex coding tasks, which enables more adaptive multi-agent systems. The contribution appears incremental because it builds on existing multi-agent approaches.

The paper tackles the limitation of manually designed agentic workflows in multi-agent systems for code generation by proposing SEW, a self-evolving framework that automatically generates and optimizes these workflows, achieving up to a 33% improvement on the LiveCodeBench dataset compared to using the backbone LLM alone.

Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks. To enable LLMs to address more complex coding challenges, existing research has focused on crafting multi-agent systems with agentic workflows, where complex coding tasks are decomposed into sub-tasks and assigned to specialized agents. Despite their effectiveness, current approaches rely heavily on hand-crafted agentic workflows, with both agent topologies and prompts manually designed, which limits their ability to adapt automatically to different types of coding problems. To address these limitations and enable automated workflow design, we propose Self-Evolving Workflow (SEW), a novel self-evolving framework that automatically generates and optimises multi-agent workflows. Extensive experiments on three coding benchmark datasets, including the challenging LiveCodeBench, demonstrate that SEW can automatically design agentic workflows and optimise them through self-evolution, bringing up to 33% improvement on LiveCodeBench compared to using the backbone LLM only. Furthermore, by investigating different representation schemes for workflows, we provide insights into the optimal way to encode workflow information as text.
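
To make the abstract's two ideas concrete (a workflow encoded as text, and a workflow improved by repeated mutation and selection), here is a minimal Python sketch. It is not the SEW implementation and calls no LLM. Every name in it (AgentStep, workflow_to_text, mutate, evolve, toy_score) is hypothetical, the plain numbered-list encoding is only one possible representation scheme, and toy_score is a placeholder for what would in practice be a pass rate measured on a coding benchmark.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentStep:
    role: str    # hypothetical agent name, e.g. "planner"
    prompt: str  # instruction that would be sent to that agent


def workflow_to_text(steps):
    """Encode a sequential workflow as plain text, one agent per line."""
    return "\n".join(f"{i}. [{s.role}] {s.prompt}" for i, s in enumerate(steps, 1))


def mutate(steps, rng):
    """Toy mutation: append one hint to a randomly chosen agent prompt."""
    hints = ["Think step by step.", "Check edge cases.", "Keep the code short."]
    i = rng.randrange(len(steps))
    new_steps = list(steps)
    new_steps[i] = AgentStep(steps[i].role, steps[i].prompt + " " + rng.choice(hints))
    return new_steps


def evolve(steps, score, generations=5, children=4, seed=0):
    """Greedy mutate-and-select loop that keeps the best workflow seen so far."""
    rng = random.Random(seed)
    best, best_score = steps, score(steps)
    for _ in range(generations):
        for _ in range(children):
            candidate = mutate(best, rng)
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(steps):
    """Placeholder fitness (assumption): number of distinct hints present."""
    text = workflow_to_text(steps)
    return sum(h in text for h in ("step by step", "edge cases", "short"))


initial = [
    AgentStep("planner", "Break the problem into sub-tasks."),
    AgentStep("coder", "Write Python code for each sub-task."),
    AgentStep("reviewer", "Review the code and report bugs."),
]
best, best_score = evolve(initial, toy_score)
print(workflow_to_text(best))
print("score:", best_score)
```

In this sketch only the prompts change. A fuller version would also mutate the agent topology (adding, removing, or reordering agents), which the abstract names as the other hand-designed component.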

Code Implementations: 1 repo