SE · Mar 30

Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild

arXiv:2603.28592 · 53.72 citations · h-index: 12
Predicted impact: top 46% in SE · last 90 days · Originality: Incremental advance
AI Analysis

This paper addresses the risk that code from AI coding assistants accumulates as technical debt in developers' codebases, and it highlights the need for better quality assurance. The advance is incremental because it builds on prior controlled studies.

The study examines technical debt from AI-generated code in real-world software projects by analyzing 304,362 AI-authored commits. It finds that 24.2% of tracked AI-introduced issues still survive at the latest repository revision, which points to long-term maintenance costs.

AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To that end, we built a dataset of 304,362 verified AI-authored commits from 6,275 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis before and after the change to precisely attribute which code smells, bugs, and security issues the AI introduced. We then track each introduced issue from the introducing commit to the latest repository revision to study its lifecycle. We identified 484,606 distinct issues; code smells are by far the most common type, accounting for 89.1% of all issues. We also find that more than 15% of commits from every AI coding assistant introduce at least one issue, although the rates vary across tools. More importantly, 24.2% of tracked AI-introduced issues still survive at the latest revision of the repository. These findings show that AI-generated code can introduce long-term maintenance costs into real software projects and highlight the need for stronger quality assurance in AI-assisted development.
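The before/after attribution and the survival measurement described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Issue` fingerprint (rule, file path, flagged source text) and the functions `introduced_issues` and `survival_rate` are hypothetical names chosen for illustration, and matching issues by exact fingerprint is an assumption that ignores file renames and edited lines.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Issue(NamedTuple):
    """Hypothetical issue fingerprint; the paper's actual matching may differ."""
    rule: str     # static-analysis rule identifier
    path: str     # file in which the issue was reported
    snippet: str  # source text of the flagged line (line numbers shift too easily)


def introduced_issues(before: Iterable[Issue], after: Iterable[Issue]) -> Counter:
    """Issues reported after the commit but not before (multiset difference)."""
    return Counter(after) - Counter(before)


def survival_rate(introduced: Counter, latest: Iterable[Issue]) -> float:
    """Fraction of introduced issues still reported at the latest revision."""
    total = sum(introduced.values())
    if total == 0:
        return 0.0
    # Multiset intersection keeps the minimum count of each fingerprint.
    surviving = sum((introduced & Counter(latest)).values())
    return surviving / total


# Toy data: one pre-existing issue, two issues added by the commit,
# and one of those two still present at the latest revision.
before = [Issue("unused-import", "app.py", "import os")]
after = before + [
    Issue("bare-except", "app.py", "except:"),
    Issue("eval-used", "util.py", "eval(expr)"),
]
latest = [Issue("eval-used", "util.py", "eval(expr)")]

new = introduced_issues(before, after)
print(sum(new.values()), survival_rate(new, latest))  # 2 0.5
```

Using a multiset difference rather than a plain set difference means that a commit adding a second instance of an already-present smell still counts as introducing one issue.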

