SE · AI · Jun 2

FLARE: Fine-Grained Diagnostic Feedback for LLM Code Refinement

arXiv:2606.03852 · 78.1 · h-index: 5
AI Analysis

For developers using LLMs for code generation, FLARE provides fine-grained bug localization feedback that significantly improves code refinement accuracy.

FLARE introduces a lightweight diagnostic model that predicts line-level bug locations to guide LLM code refinement, achieving absolute improvements of 1.72% to 7.42% over baselines on LiveCodeBench and BigCodeBench, with further gains from candidate search.

Large language models often generate buggy code. Existing methods rely on feedback signals such as test failures and self-critiques to iteratively refine the generated code. Such signals are either too coarse-grained or too high-level to tell the model where the bug is. In this work, we present Flare, an iterative framework with a lightweight diagnostic model that predicts line-level suspiciousness signals for bug localization and code refinement. Because diagnostic predictions are inherently uncertain, Flare searches over the top-k suspicious regions and selects the best candidate according to execution outcomes. Experiments on LiveCodeBench and BigCodeBench with five base LLMs show that, even without candidate search (k=1), Flare outperforms the strongest baseline by an absolute 1.72% to 7.42%. Furthermore, searching over 10 candidates yields an average improvement of 8.50% over no candidate search. When evaluated in isolation, our lightweight diagnostic model outperforms recent fault localization methods, demonstrating that it can provide reliable fine-grained guidance for code refinement.
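
The candidate search described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`top_k_lines`, `count_passed`, `refine_with_search`), the hard-coded suspiciousness scores, and the `swap_minus_for_plus` stub standing in for the LLM are all hypothetical, and "best candidate according to execution outcomes" is assumed here to mean "passes the most tests".

```python
from typing import Callable, Dict, List, Sequence, Tuple

Test = Callable[[Dict], bool]


def top_k_lines(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k most suspicious lines (ties go to the earlier line)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def count_passed(source: str, tests: Sequence[Test]) -> int:
    """Execute the source and count how many tests pass; crashes count as failures."""
    namespace: Dict = {}
    try:
        exec(source, namespace)
    except Exception:
        return 0
    passed = 0
    for test in tests:
        try:
            if test(namespace):
                passed += 1
        except Exception:
            pass
    return passed


def refine_with_search(
    lines: Sequence[str],
    scores: Sequence[float],
    propose_fix: Callable[[Sequence[str], int], str],
    tests: Sequence[Test],
    k: int,
) -> Tuple[List[str], int]:
    """Try one fix per top-k suspicious line and keep the best by tests passed."""
    best_lines = list(lines)
    best_passed = count_passed("\n".join(lines), tests)
    for idx in top_k_lines(scores, k):
        candidate = list(lines)
        candidate[idx] = propose_fix(lines, idx)  # assumed: one edit per region
        passed = count_passed("\n".join(candidate), tests)
        if passed > best_passed:
            best_lines, best_passed = candidate, passed
    return best_lines, best_passed


# Toy stand-ins (assumptions): a fixed score vector replaces the diagnostic
# model, and a string substitution replaces the LLM's proposed repair.
def swap_minus_for_plus(lines: Sequence[str], idx: int) -> str:
    return lines[idx].replace("-", "+")


BUGGY = ["def add(a, b):", "    return a - b"]
SCORES = [0.6, 0.4]  # deliberately ranks the wrong line first
TESTS = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](1, 1) == 2,
]

no_search = refine_with_search(BUGGY, SCORES, swap_minus_for_plus, TESTS, k=1)
with_search = refine_with_search(BUGGY, SCORES, swap_minus_for_plus, TESTS, k=2)
```

In this toy setup the top-ranked line is a false positive, so `k=1` leaves the bug in place, while `k=2` also tries the second-ranked line and finds the fix. It only illustrates why searching over several suspicious regions can help when localization is uncertain; it says nothing about the magnitude of the gains reported in the paper.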

