SE | Mar 30

Beyond Localization: Recoverable Headroom and Residual Frontier in Repository-Level RAG-APR

Pengtao Zhao, Boyang Yang, Bach Le, Feng Liu, Haoye Tian

arXiv:2603.29067 | 76.4 | h-index: 5

AI Analysis

For researchers in automated program repair, this work provides a systematic analysis of the diminishing returns of stronger localization and identifies residual frontiers, though the findings are incremental and protocol-specific.

The paper investigates post-localization levers in repository-level automated program repair (APR) on SWE-bench Lite, finding that while Oracle Localization improves all three tested systems, success rates remain below 50%, and prompt-level fusion adds only 6 solved instances beyond the native union of three systems.

Repository-level automated program repair (APR) increasingly treats stronger localization as the main path to better repair. We ask a more targeted question: once localization is strengthened, which post-localization levers still provide recoverable gains, which are bounded within our protocol, and what residual frontier remains? We study this question on SWE-bench Lite with three representative repository-level RAG-APR paradigms, Agentless, KGCompass, and ExpeRepair. Our protocol combines Oracle Localization, within-pool Best-of-K, fixed-interface added context probes with per-condition same-token filler controls and same-repository hard negatives, and a common-wrapper oracle check. Oracle Localization improves all three systems, but Oracle success still stays below 50%. Extra candidate diversity still helps inside the sampled 10-patch pools, but that headroom saturates quickly. Under the two fixed interfaces, most informative added context conditions still outperform their own matched controls. The common-wrapper check shows different system responses: under a common wrapper, gains remain large for KGCompass and ExpeRepair, while Agentless changes more with builder choice. Prompt-level fusion still leaves a large residual frontier: the best fixed probe adds only 6 solved instances beyond the native three-system Solved@10 union. Overall, stronger localization, bounded search, evidence quality, and interface design all shape repository-level repair outcomes.
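The Solved@10 union and the "adds only 6 solved instances" comparison in the abstract reduce to set arithmetic over per-instance outcomes. The sketch below is a minimal illustration under assumed conventions, not the authors' evaluation code: the data layout (instance id mapped to a list of pass/fail outcomes per sampled patch), the function names, and the toy system names are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical layout: instance id -> pass/fail outcome of each sampled patch.
Pool = Dict[str, List[bool]]


def solved_at_k(pool: Pool, k: int) -> Set[str]:
    """Instances with at least one passing patch among the first k samples."""
    return {iid for iid, outcomes in pool.items() if any(outcomes[:k])}


def union_solved(pools: Dict[str, Pool], k: int) -> Set[str]:
    """Instances solved by at least one system within its first k samples."""
    solved: Set[str] = set()
    for pool in pools.values():
        solved |= solved_at_k(pool, k)
    return solved


def residual_gain(probe_solved: Set[str], pools: Dict[str, Pool], k: int) -> Set[str]:
    """Instances a probe solves that no system's Solved@k already covers."""
    return probe_solved - union_solved(pools, k)


# Toy data with made-up system names and outcomes (not results from the paper).
pools = {
    "system_a": {"i1": [False, True], "i2": [False, False], "i3": [False, False]},
    "system_b": {"i1": [False, False], "i2": [True, False], "i3": [False, False]},
}
probe_solved = {"i2", "i3"}

union_at_2 = union_solved(pools, k=2)
extra = residual_gain(probe_solved, pools, k=2)
# Per-k counts for one system show how quickly within-pool headroom saturates.
curve_a = [len(solved_at_k(pools["system_a"], k)) for k in (1, 2)]
```

With real data, `k` would be 10 and each pool would hold the 10 sampled patches per instance; the size of `extra` then corresponds to the number of instances a probe adds beyond the native union.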
