SE, Mar 31

How and Why Agents Can Identify Bug-Introducing Commits

arXiv:2603.29378 71.63 citations h-index: 4

AI Analysis

This paper addresses a long-standing bottleneck in software engineering, the identification of bug-introducing commits, and reports a substantial improvement over existing methods for developers and researchers.

The paper tackles the problem of identifying bug-introducing commits from fix commits in software repositories, achieving an F1-score increase from 0.64 to 0.81 on the Linux kernel dataset. It reveals that LLM-based agents succeed by deriving greppable patterns from fix commits to search candidate sets effectively.

Śliwerski, Zimmermann, and Zeller (SZZ) just won the 2026 ACM SIGSOFT Impact Award for asking: When do changes induce fixes? Their paper from 2005 served as the foundation for a wide array of approaches aimed at identifying bug-introducing changes (or commits) from fix commits in software repositories. But even after two decades of progress, the best-performing approach from 2025 yields a modest increase of 10 percentage points in F1-score on the most popular Linux kernel dataset. In this paper, we uncover how and why LLM-based agents can substantially advance the state-of-the-art in identifying bug-introducing commits from fix commits. We propose a simple agentic workflow based on searching a set of candidate commits and find that it raises the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset, a bigger jump than between the original 2005 method (0.54) and the previous SOTA (0.64). We also uncover why agents are so successful: They derive short greppable patterns from the fix commit diff and message and use them to effectively search and find bug-introducing commits in large candidate sets. Finally, we also discuss how these insights might enable further progress in bug detection, root cause understanding, and repair.
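
To make the "greppable pattern" idea concrete, here is a minimal sketch in Python. It is not the paper's workflow: the `Commit` type, the identifier heuristic in `derive_patterns`, the hit-count ranking, and the sample diffs are all assumptions made for illustration. It uses only the fix commit diff (the commit message is ignored) and searches in-memory candidate diffs rather than a real repository.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    message: str
    diff: str  # unified diff text


# Hypothetical heuristic: identifiers of 6+ characters make useful search keys.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]{5,}")


def derive_patterns(fix: Commit, max_patterns: int = 5) -> list[str]:
    """Collect distinct identifiers from changed lines of the fix, longest first."""
    seen: list[str] = []
    for line in fix.diff.splitlines():
        # Skip file headers. Simplification: a changed line whose content
        # itself starts with "--" or "++" would also be skipped.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            for token in IDENT.findall(line[1:]):
                if token not in seen:
                    seen.append(token)
    seen.sort(key=len, reverse=True)  # stable sort keeps first-seen order on ties
    return seen[:max_patterns]


def rank_candidates(patterns: list[str], candidates: list[Commit]) -> list[str]:
    """Return SHAs of candidates whose added lines contain a pattern, best first."""
    scored = []
    for commit in candidates:
        added = "\n".join(
            line[1:]
            for line in commit.diff.splitlines()
            if line.startswith("+") and not line.startswith("+++")
        )
        hits = sum(1 for pattern in patterns if pattern in added)
        if hits:
            scored.append((hits, commit.sha))
    scored.sort(key=lambda pair: -pair[0])
    return [sha for _, sha in scored]


# Made-up sample data; file and function names are hypothetical.
FIX = Commit(
    sha="fff000",
    message="foo: check buffer before release",
    diff=(
        "--- a/drivers/foo.c\n"
        "+++ b/drivers/foo.c\n"
        "@@ -10,1 +10,2 @@\n"
        "-\tfoo_release_buffer(dev);\n"
        "+\tif (dev->buf)\n"
        "+\t\tfoo_release_buffer(dev);\n"
    ),
)
CANDIDATES = [
    Commit("aaa111", "foo: add teardown path",
           "--- a/drivers/foo.c\n+++ b/drivers/foo.c\n+\tfoo_release_buffer(dev);\n"),
    Commit("bbb222", "bar: add init",
           "--- a/drivers/bar.c\n+++ b/drivers/bar.c\n+\tbar_init(dev);\n"),
]

patterns = derive_patterns(FIX)
ranking = rank_candidates(patterns, CANDIDATES)
print(patterns, ranking)
```

In this toy setup the only candidate that added the call touched by the fix is ranked. An LLM-based agent would presumably choose and refine its patterns far more flexibly than a fixed regular expression, and would run them against the repository history instead of a hand-built list.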
