CODESTRUCT: Code Agents over Structured Action Spaces
This work addresses reliability and efficiency problems that developers face when using code agents. The contribution is incremental: it adds a structured interface on top of existing agent frameworks.
The paper tackles failures of LLM-based code agents caused by brittle string matching by reframing codebases as structured action spaces. On SWE-Bench Verified, this yields Pass@1 accuracy gains of 1.2-5.0% and token consumption reductions of 12-38% for most models.
LLM-based code agents treat repositories as unstructured text, applying edits through brittle string matching that frequently fails due to formatting drift or ambiguous patterns. We propose reframing the codebase as a structured action space where agents operate on named AST entities rather than text spans. Our framework, CODESTRUCT, provides readCode for retrieving complete syntactic units and editCode for applying syntax-validated transformations to semantic program elements. Evaluated on SWE-Bench Verified across six LLMs, CODESTRUCT improves Pass@1 accuracy by 1.2-5.0% while reducing token consumption by 12-38% for most models. Models that frequently fail to produce valid patches under text-based interfaces benefit most: GPT-5-nano improves by 20.8% as empty-patch failures drop from 46.6% to 7.2%. On CodeAssistBench, we observe consistent accuracy gains (+0.8-4.4%) with cost reductions up to 33%. Our results show that structure-aware interfaces offer a more reliable foundation for code agents.
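To make the idea of operating on named AST entities concrete, here is a minimal sketch built on Python's standard `ast` module. It is not the CODESTRUCT implementation: `read_code` and `edit_code` are hypothetical stand-ins for the paper's readCode and editCode. The restriction to top-level Python functions and classes, the signatures, and the error handling are all assumptions made for illustration.

```python
import ast

# Hypothetical sketch; names and behavior are assumptions, not the paper's API.
_ENTITY_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def _find_entity(tree: ast.Module, name: str) -> ast.AST:
    """Return the top-level function or class called `name`."""
    for node in tree.body:
        if isinstance(node, _ENTITY_TYPES) and node.name == name:
            return node
    raise KeyError(f"no top-level entity named {name!r}")


def read_code(source: str, name: str) -> str:
    """Return the complete source text of one named syntactic unit."""
    node = _find_entity(ast.parse(source), name)
    return ast.get_source_segment(source, node)


def edit_code(source: str, name: str, new_source: str) -> str:
    """Replace a named entity, rejecting edits that do not parse."""
    node = _find_entity(ast.parse(source), name)
    ast.parse(new_source)  # raises SyntaxError if the replacement is invalid

    lines = source.splitlines(keepends=True)
    if not new_source.endswith("\n"):
        new_source += "\n"
    # lineno/end_lineno are 1-based and inclusive; decorators are not covered.
    updated = (
        "".join(lines[: node.lineno - 1])
        + new_source
        + "".join(lines[node.end_lineno:])
    )
    ast.parse(updated)  # validate the whole file after the edit
    return updated


if __name__ == "__main__":
    src = "def add(a, b):\n    return a - b\n\ndef mul(a, b):\n    return a * b\n"
    print(read_code(src, "add"))
    print(edit_code(src, "add", "def add(a, b):\n    return a + b"))
```

Because the edit is addressed by entity name rather than by a text pattern, it does not depend on the agent reproducing the original whitespace exactly, and an edit that would leave the file unparseable is rejected before it is applied.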