SE · AI · Mar 31

A Study on the Impact of Fault Localization Granularity for Repository-Scale Code Repair Tasks

Joseph Townsend, Chandresh Pravin, Kwun Ho Ngan, Matthieu Parizy

arXiv:2604.00167 · 14.3

Predicted impact: top 40% in SE · last 90 days

Originality: Incremental advance

AI Analysis

This work addresses fault localization in automated program repair, a problem relevant to both developers and researchers. It is rated incremental because it isolates one aspect of the pipeline (localization granularity) and does not aim to improve on the state of the art.

The study investigated how fault localization granularity affects automatic code repair at the repository scale, finding that function-level granularity yields the highest repair rate compared to line-level and file-level on the SWE-Bench-Mini dataset, though the ideal granularity may be task-dependent.

Automatic program repair can be a challenging task, especially when resolving complex issues at the repository level, which often involves issue reproduction, fault localization, code repair, testing and validation. Issues of this scale are commonly found in popular GitHub repositories or in datasets derived from them. Some repository-level approaches separate localization and repair into distinct phases. Where this is the case, the fault localization approaches vary in the granularity of localization. Although the impact of granularity has been explored to some degree on smaller datasets, not all such studies isolate it from the separate question of localization accuracy by testing code repair under the assumption of perfect fault localization. To the best of the authors' knowledge, no repository-scale studies have explicitly investigated granularity under this assumption, nor conducted a systematic empirical comparison of granularity levels in isolation. We propose a framework for performing such tests by modifying the localization phase of the Agentless framework to retrieve ground-truth localization data and include this as context in the prompt fed to the repair phase. We show that under this configuration, and in aggregate over the SWE-Bench-Mini dataset, function-level granularity yields a higher repair rate than line-level or file-level granularity. However, a deeper dive suggests that the ideal granularity may in fact be task-dependent. This study is not intended to improve on the state of the art, nor do we intend for results to be compared against any complete agentic frameworks. Rather, we present a proof of concept for investigating how fault localization may impact automatic code repair in repository-scale scenarios. We present preliminary findings to this end and encourage further research into the relationship between the two phases.
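The idea of feeding ground-truth localization to the repair phase at different granularities can be illustrated with a minimal sketch. This is not the authors' code and does not use the Agentless API: the function names (`changed_lines`, `enclosing_functions`, `localization_context`), the output format, and the assumption that ground truth is derived from a unified-diff gold patch over Python files are all hypothetical choices made for illustration.

```python
import ast
import re

# Matches a unified-diff hunk header such as "@@ -10,4 +10,6 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_lines(patch: str) -> dict[str, list[int]]:
    """Map each patched file to the pre-patch line numbers the diff removes.

    Assumption: hunks that only add lines contribute no line numbers,
    although the file itself is still recorded.
    """
    locations: dict[str, list[int]] = {}
    path = None
    old_no = 0                 # current line number in the pre-patch file
    old_left = new_left = 0    # lines still expected in the current hunk
    for line in patch.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("-"):
                locations[path].append(old_no)
                old_no += 1
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:  # context line, present in both versions
                old_no += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("--- "):
            path = line[4:].removeprefix("a/")
        match = HUNK_RE.match(line)
        if match:
            old_no = int(match.group(1))
            old_left = int(match.group(2) or 1)
            new_left = int(match.group(4) or 1)
            locations.setdefault(path, [])
    return locations


def enclosing_functions(source: str, lines: list[int]) -> list[str]:
    """Return names of functions whose line span covers any given line."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(node.lineno <= n <= node.end_lineno for n in lines):
                names.add(node.name)
    return sorted(names)


def localization_context(patch: str, sources: dict[str, str], granularity: str) -> str:
    """Render ground-truth fault locations as text for a repair prompt.

    `sources` maps file paths to pre-patch file contents; it is only
    consulted for function-level granularity.
    """
    out = []
    for path, lines in changed_lines(patch).items():
        if granularity == "file":
            out.append(path)
        elif granularity == "function":
            out.extend(f"{path}: function {name}"
                       for name in enclosing_functions(sources[path], lines))
        elif granularity == "line":
            out.extend(f"{path}: line {n}" for n in lines)
        else:
            raise ValueError(f"unknown granularity: {granularity!r}")
    return "\n".join(out)
```

Calling `localization_context` three times on the same gold patch, once per granularity, gives three prompt contexts that differ only in how precisely they point at the fault, which is the variable the study isolates. A real pipeline would also need to handle new or deleted files, non-Python sources, and insert-only hunks, all of which this sketch leaves out.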
