SE, AI, LG | Jan 30

On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study

arXiv:2601.23059v1 | 1 citation | h-index: 22
Originality: Incremental advance
AI Analysis

This study addresses a practical problem for software engineers by challenging the common practice of removing comments in bug-fixing tools. The advance is incremental because it builds on existing LLM methods.

The study investigates whether code comments improve automated bug fixing with LLMs. It finds that comments can increase accuracy by up to threefold when present during both training and inference, and that training with comments does not degrade performance on instances that lack them.

Large Language Models (LLMs) are increasingly relevant in Software Engineering research and practice, with Automated Bug Fixing (ABF) being one of their key applications. ABF involves transforming a buggy method into its fixed equivalent. A common preprocessing step in ABF is removing comments from code prior to training. However, we hypothesize that comments may play a critical role in fixing certain types of bugs by providing valuable design and implementation insights. In this study, we investigate how the presence or absence of comments, both during training and at inference time, impacts the bug-fixing capabilities of LLMs. We conduct an empirical evaluation comparing two model families, each evaluated under all combinations of training and inference conditions (with and without comments), thereby revisiting the common practice of removing comments during training. To address the limited availability of comments in state-of-the-art datasets, we use an LLM to automatically generate comments for methods lacking them. Our findings show that comments improve ABF accuracy by up to threefold when present in both phases, while training with comments does not degrade performance when instances lack them. Additionally, an interpretability analysis identifies that comments detailing method implementation are particularly effective in helping LLMs fix bugs accurately.
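The comment-removal preprocessing step and the training/inference condition grid described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the methods are written in Java (the abstract does not name a language), and the names `strip_comments` and `experimental_conditions` are hypothetical helpers introduced only for illustration. The regex-based stripper is a simplification that does not handle Java text blocks.

```python
import re
from itertools import product

# Matches Java string/char literals (which are kept) or comments (which are dropped).
# Assumption: Java-style syntax; text blocks (triple-quoted strings) are not handled.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # string literal
    r"|'(?:\\.|[^'\\])*'"     # char literal
    r"|//[^\n]*"              # line comment
    r"|/\*.*?\*/",            # block comment (including Javadoc)
    re.DOTALL,
)


def strip_comments(java_source: str) -> str:
    """Return the source with comments removed and string literals untouched."""
    def keep_literals(match: re.Match) -> str:
        text = match.group(0)
        # Comments start with '/', literals start with a quote character.
        return "" if text.startswith("/") else text
    return _TOKEN.sub(keep_literals, java_source)


def experimental_conditions() -> list:
    """Enumerate every combination of comments present/absent in each phase."""
    return [
        {"train_with_comments": train, "infer_with_comments": infer}
        for train, infer in product([True, False], repeat=2)
    ]


if __name__ == "__main__":
    buggy = 'int f() { // add one\n  String s = "a//b"; /* block */ return 1; }'
    print(strip_comments(buggy))
    for condition in experimental_conditions():
        print(condition)
```

Matching string literals in the same pattern as comments keeps a `//` inside a string from being mistaken for a comment, which a naive comment-only regex would get wrong.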
