SE · Mar 28

A Large-Scale Comprehensive Measurement of AI-Generated Code in Real-World Repositories

arXiv:2603.27130 · 64.1 · h-index: 18
Predicted impact top 32% in SE · last 90 days
Originality Synthesis-oriented
AI Analysis

For software engineering researchers and practitioners, this work characterizes AI-generated code in real-world settings, addressing the gap left by prior small-scale, controlled evaluations.

This paper presents a large-scale empirical study of AI-generated code in real-world repositories, analyzing code-level and commit-level metrics. The study reveals differences between AI-generated and human-written code and how AI assistance affects development practices.

Large language models (LLMs) are rapidly transforming software engineering by enabling developers to generate code ranging from small snippets to entire projects. As AI-generated code becomes increasingly integrated into real-world systems, understanding its characteristics and impact is critical. However, prior work primarily focuses on small-scale, controlled evaluations and lacks comprehensive analysis in real-world settings. In this paper, we present a large-scale empirical study of AI-generated code in real-world repositories. We analyze both code-level metrics (e.g., complexity, structure, and defect-related indicators) and commit-level characteristics (e.g., commit size, frequency, and post-commit stability). To enable this study, we develop a heuristic filter combined with LLM classification to identify AI-generated code and construct a large dataset. Our results provide new insights into how AI-generated code differs from human-written code and how AI assistance influences development practices. These findings contribute to a deeper understanding of the practical implications of AI-assisted programming.

