Inside the Scaffold: A Source-Code Taxonomy of Coding Agent Architectures

arXiv:2604.03515 | 74.71 citations | h-index: 1 | Has Code

AI Analysis

For researchers and practitioners building or studying coding agents, this taxonomy provides a grounded, reusable reference to understand and compare scaffold architectures, addressing the gap between abstract capability classifications and actual implementation details.

The paper presents a source-code-level architectural taxonomy of 13 open-source LLM-based coding agents, analyzing 12 dimensions across control architecture, tool/environment interface, and resource management. It finds that scaffolds resist discrete classification, with 11 of 13 agents composing multiple loop primitives, and identifies convergent and divergent design patterns.

LLM-based coding agents can localize bugs, generate patches, and run tests with diminishing human oversight, yet the scaffolding code that surrounds the language model (the control loop, tool definitions, state management, and context strategy) remains poorly understood. Existing surveys classify agents by abstract capabilities (tool use, planning, reflection) that cannot distinguish between architecturally distinct systems, and trajectory studies observe what agents do without examining the scaffold code that determines why. This paper presents a source-code-level architectural taxonomy derived from analysis of 13 open-source coding agent scaffolds at pinned commit hashes. Each agent is characterized across 12 dimensions organized into three layers: control architecture, tool and environment interface, and resource management. The analysis reveals that scaffold architectures resist discrete classification: control strategies range from fixed pipelines to Monte Carlo Tree Search, tool counts range from 0 to 37, and context compaction spans seven distinct strategies. Five loop primitives (ReAct, generate-test-repair, plan-execute, multi-attempt retry, tree search) function as composable building blocks that agents layer in different combinations; 11 of 13 agents compose multiple primitives rather than relying on a single control structure. Dimensions converge where external constraints dominate (tool capability categories, edit formats, execution isolation) and diverge where open design questions remain (context compaction, state management, multi-model routing). All taxonomic claims are grounded in file paths and line numbers, providing a reusable reference for researchers studying agent behavior and practitioners designing new scaffolds.
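The idea that loop primitives act as composable building blocks can be illustrated with a minimal sketch. It is not taken from any of the 13 analyzed scaffolds: every name here (`Primitive`, `react_loop`, `generate_test_repair`, `multi_attempt_retry`, the stub policy and test runner) is hypothetical, and the language model is replaced by a deterministic stub so the example runs on its own. It shows three of the five primitives layered into one agent.

```python
from __future__ import annotations
from typing import Callable, Optional

# Hypothetical type: a primitive maps a task description to a patch, or None on failure.
Primitive = Callable[[str], Optional[str]]


def react_loop(policy: Callable[[str, list], str],
               tools: dict,
               max_steps: int = 5) -> Primitive:
    """ReAct-style loop: the policy alternates tool calls with a final answer."""
    def run(task: str) -> Optional[str]:
        history = []
        for _ in range(max_steps):
            action = policy(task, history)       # e.g. "read:foo.py" or "finish:<patch>"
            name, _, arg = action.partition(":")
            if name == "finish":
                return arg
            tool = tools.get(name)
            observation = tool(arg) if tool else "unknown tool " + name
            history.append(action + " -> " + observation)
        return None                              # step budget exhausted
    return run


def generate_test_repair(inner: Primitive,
                         run_tests: Callable[[str], Optional[str]],
                         max_repairs: int = 2) -> Primitive:
    """Wrap a primitive: test its output and feed any failure back into the task."""
    def run(task: str) -> Optional[str]:
        prompt = task
        for _ in range(max_repairs + 1):
            patch = inner(prompt)
            if patch is None:
                return None
            failure = run_tests(patch)           # None means the tests passed
            if failure is None:
                return patch
            prompt = task + "\nPrevious patch failed: " + failure
        return None
    return run


def multi_attempt_retry(inner: Primitive, attempts: int = 3) -> Primitive:
    """Wrap a primitive: rerun it from scratch until it yields a result."""
    def run(task: str) -> Optional[str]:
        for _ in range(attempts):
            result = inner(task)
            if result is not None:
                return result
        return None
    return run


# Stub stand-ins for the model and the environment (assumptions for illustration only).
def stub_policy(task: str, history: list) -> str:
    if not history:
        return "read:foo.py"                     # first inspect a file
    return "finish:fixed" if "failed" in task else "finish:buggy"


def stub_tests(patch: str) -> Optional[str]:
    return None if patch == "fixed" else "assertion error"


tools = {"read": lambda path: "contents of " + path}

# Layering: retry( generate-test-repair( ReAct ) )
agent = multi_attempt_retry(
    generate_test_repair(react_loop(stub_policy, tools), stub_tests)
)
result = agent("fix bug")
```

Because each wrapper takes a `Primitive` and returns a `Primitive`, the layers can be reordered or omitted without changing the others, which is one way to read the abstract's observation that most agents combine several primitives rather than relying on a single control structure.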
