SE, Mar 31

How and Why Agents Can Identify Bug-Introducing Commits

arXiv:2603.29378 71.63 citations h-index: 4
AI Analysis

Identifying which commit introduced a bug is a long-standing bottleneck for developers and software engineering researchers. The paper reports a substantial improvement over existing methods on this task.

The paper tackles the problem of identifying bug-introducing commits from fix commits in software repositories, achieving an F1-score increase from 0.64 to 0.81 on the Linux kernel dataset. It reveals that LLM-based agents succeed by deriving greppable patterns from fix commits to search candidate sets effectively.

Śliwerski, Zimmermann, and Zeller (SZZ) just won the 2026 ACM SIGSOFT Impact Award for asking: When do changes induce fixes? Their paper from 2005 served as the foundation for a wide array of approaches aimed at identifying bug-introducing changes (or commits) from fix commits in software repositories. But even after two decades of progress, the best-performing approach from 2025 yields a modest increase of 10 percentage points in F1-score on the most popular Linux kernel dataset. In this paper, we uncover how and why LLM-based agents can substantially advance the state-of-the-art in identifying bug-introducing commits from fix commits. We propose a simple agentic workflow based on searching a set of candidate commits and find that it raises the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset, a bigger jump than between the original 2005 method (0.54) and the previous SOTA (0.64). We also uncover why agents are so successful: They derive short greppable patterns from the fix commit diff and message and use them to effectively search and find bug-introducing commits in large candidate sets. Finally, we also discuss how these insights might enable further progress in bug detection, root cause understanding, and repair.
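The pattern-then-search idea described in the abstract can be illustrated with a rough sketch. This is not the paper's workflow: in the paper an LLM-based agent derives the patterns, whereas here a naive identifier heuristic stands in for it. The function names `derive_patterns` and `rank_candidates`, the six-character minimum token length, and the in-memory dictionary of candidate diffs are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: identifiers of six or more characters are distinctive enough to grep for.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]{5,}")


def derive_patterns(fix_diff: str, fix_message: str, limit: int = 5) -> list[str]:
    """Pick short greppable tokens from the lines a fix removes (hypothetical heuristic)."""
    counts = Counter()
    for line in fix_diff.splitlines():
        # Removed lines start with "-"; "---" marks the file header and is skipped.
        if line.startswith("-") and not line.startswith("---"):
            counts.update(IDENT.findall(line))
    message_tokens = set(IDENT.findall(fix_message))
    # Tokens that also appear in the fix message come first, then by frequency.
    ranked = sorted(counts, key=lambda t: (t not in message_tokens, -counts[t], t))
    return ranked[:limit]


def rank_candidates(patterns: list[str], candidates: dict[str, str]) -> list[tuple[str, int]]:
    """Score each candidate commit by how many patterns occur in the lines it adds."""
    scores = []
    for commit_id, diff in candidates.items():
        added = [
            line[1:]
            for line in diff.splitlines()
            if line.startswith("+") and not line.startswith("+++")
        ]
        hits = sum(1 for p in patterns if any(p in line for line in added))
        scores.append((commit_id, hits))
    # Highest score first; ties keep the original candidate order.
    return sorted(scores, key=lambda s: -s[1])
```

In a real repository the candidate diffs would come from version control history rather than a dictionary, and the top-ranked commits would still need to be inspected, since a textual match alone does not show that a commit introduced the bug.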

