SE, AI, MA | Jun 19, 2025

SemAgent: A Semantics Aware Program Repair Agent

Anvith Pabba, Alex Mathai, Anindya Chakraborty, Baishakhi Ray

arXiv:2506.16650v1 | 7 citations | h-index: 4

Originality: Incremental advance

AI Analysis

This paper addresses the problem of generating robust, semantically consistent patches in automated program repair, and it represents a strong incremental improvement over existing agentic systems.

The paper tackles the problem of automated program repair systems overfitting to specific issues by introducing SemAgent, a workflow-based approach that leverages issue, code, and execution semantics to generate more complete patches. Their method achieves a 44.66% solve rate on SWEBench-Lite, beating other workflow-based approaches and showing a 7.66% absolute improvement over a baseline lacking deep semantic understanding.
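As a rough sanity check on these percentages, the short sketch below converts them into issue counts. It assumes the SWEBench-Lite test split contains 300 issues, a figure this summary does not state, so the resulting counts are inferred rather than reported by the authors.

```python
# Assumption: SWEBench-Lite has 300 test issues (not stated in this summary).
TOTAL_ISSUES = 300

solve_rate = 44.66    # percent, as reported
improvement = 7.66    # absolute percentage points over the baseline, as reported

# Baseline solve rate implied by the two reported numbers.
baseline_rate = round(solve_rate - improvement, 2)

# Convert the percentages into approximate issue counts.
solved = round(TOTAL_ISSUES * solve_rate / 100)
baseline_solved = round(TOTAL_ISSUES * baseline_rate / 100)

print(f"baseline rate: {baseline_rate}%")
print(f"solved: {solved}, baseline: {baseline_solved}, "
      f"extra issues: {solved - baseline_solved}")
```

Under that assumption, the reported rates correspond to roughly 134 resolved issues against 111 for the baseline.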

Large Language Models (LLMs) have shown impressive capabilities in downstream software engineering tasks such as Automated Program Repair (APR). In particular, there has been considerable research on repository-level issue-resolution benchmarks such as SWE-Bench. Although there has been significant progress on this topic, we notice that, in the process of solving such issues, existing agentic systems tend to hyper-localize on immediately suspicious lines of code and fix them in isolation, without a deeper understanding of the issue semantics, code semantics, or execution semantics. Consequently, many existing systems generate patches that overfit to the user issue, even when a more general fix is preferable. To address this limitation, we introduce SemAgent, a novel workflow-based procedure that leverages issue, code, and execution semantics to generate patches that are complete, that is, patches that identify and fix all lines relevant to the issue. We achieve this through a novel pipeline that (a) leverages execution semantics to retrieve relevant context, (b) comprehends issue semantics via generalized abstraction, (c) isolates code semantics within the context of this abstraction, and (d) leverages this understanding in a two-stage architecture: a repair stage that proposes fine-grained fixes, followed by a reviewer stage that filters relevant fixes based on the inferred issue semantics. Our evaluations show that our methodology achieves a solve rate of 44.66% on the SWEBench-Lite benchmark, beating all other workflow-based approaches, and an absolute improvement of 7.66% over our baseline, which lacks such deep semantic understanding. We note that our approach performs particularly well on issues requiring multi-line reasoning (and editing) and edge-case handling, suggesting that incorporating issue and code semantics into APR pipelines can lead to robust and semantically consistent repairs.
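The two-stage architecture in step (d) can be illustrated with a minimal sketch. This is not the authors' implementation: `CandidateFix`, `two_stage_repair`, and the toy `propose` and `is_relevant` callables are hypothetical names, and simple string checks stand in for what would presumably be LLM-backed stages.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CandidateFix:
    """One fine-grained edit proposed by the repair stage (hypothetical type)."""
    location: str  # e.g. "parser.py:10"
    change: str    # human-readable description of the edit


def two_stage_repair(
    issue_abstraction: str,
    propose: Callable[[str], List[CandidateFix]],
    is_relevant: Callable[[CandidateFix, str], bool],
) -> List[CandidateFix]:
    """Run the repair stage, then keep only the fixes the reviewer accepts."""
    candidates = propose(issue_abstraction)  # repair stage
    return [fix for fix in candidates
            if is_relevant(fix, issue_abstraction)]  # reviewer stage


# Toy stand-ins for the two stages (assumptions, for illustration only).
def toy_propose(abstraction: str) -> List[CandidateFix]:
    return [
        CandidateFix("parser.py:10", "handle empty separator in split path"),
        CandidateFix("parser.py:57", "handle empty separator in rsplit path"),
        CandidateFix("utils.py:3", "rename unrelated variable"),
    ]


def toy_is_relevant(fix: CandidateFix, abstraction: str) -> bool:
    # Accept a fix only if it addresses the concept named in the abstraction.
    return "separator" in fix.change and "separator" in abstraction


accepted = two_stage_repair(
    "functions that split on a separator must handle an empty separator",
    toy_propose,
    toy_is_relevant,
)
for fix in accepted:
    print(fix.location, "->", fix.change)
```

In this toy run the reviewer keeps both separator-related fixes and drops the unrelated rename, mirroring the idea of fixing every line relevant to the generalized issue rather than only the first suspicious one.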
