SE · Mar 14

Do AI Agents Really Improve Code Readability?

Kyogo Horikawa, Kosei Horikawa, Yutaro Kashiwa, Hidetake Uwano, Hajimu Iida

arXiv:2603.13723 · 6.1 · h-index: 12

Predicted impact: top 78% in SE · last 90 days · Originality: Synthesis-oriented

AI Analysis

This work addresses the unclear impact of AI agents on code readability, a question that matters to software developers and maintainers. It reveals that readability-focused refactoring can harm other quality aspects, which is an incremental finding.

This study investigated whether AI agent-based refactoring improves code readability by analyzing 403 commits from the AIDev dataset, finding that AI agents primarily targeted logic complexity (42.4%) and documentation (24.2%), but readability-focused commits often degraded traditional quality metrics like the Maintainability Index (decreased in 56.1%) and Cyclomatic Complexity (increased in 42.7%).

Code readability is fundamental to software quality and maintainability. Poor readability extends development time, increases bug-inducing risks, and contributes to technical debt. With the rapid advancement of Large Language Models, AI agent-based approaches have emerged as a promising paradigm for automated refactoring, capable of decomposing complex tasks through autonomous planning and execution. While prior studies have examined refactoring by AI agents, these analyses cover all forms of refactoring, including performance optimization and structural improvement. As a result, the extent to which AI agent-based refactoring specifically improves code readability remains unclear. This study investigates the impact of AI agent-based refactoring on code readability. We extracted commits containing readability-related keywords from the AIDev dataset and analyzed changes in readability metrics before and after each commit, covering 403 commits evaluated using multiple quantitative metrics. Our results indicate that AI agents primarily target logic complexity (42.4%) and documentation improvements (24.2%) rather than surface-level aspects like naming conventions or formatting. However, contrary to expectations, readability-focused commits often degraded traditional quality metrics: the Maintainability Index decreased in 56.1% of commits, while Cyclomatic Complexity increased in 42.7%.
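The before/after metric comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's tooling: the keyword list is hypothetical, the cyclomatic complexity counter is a simplified approximation built on Python's `ast` module, and the Maintainability Index uses the classic formula MI = 171 - 5.2 ln(V) - 0.23 CC - 16.2 ln(LOC), which may differ from the variant the authors applied.

```python
import ast
import math

# Hypothetical keyword list; the paper's actual keywords are not stated here.
READABILITY_KEYWORDS = ("readability", "readable", "clarity", "simplify")


def mentions_readability(commit_message: str) -> bool:
    """Return True if the commit message contains any readability keyword."""
    lowered = commit_message.lower()
    return any(keyword in lowered for keyword in READABILITY_KEYWORDS)


# AST node types that each add one decision point (a simplified approximation).
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 plus the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" contributes two extra paths.
            complexity += len(node.values) - 1
    return complexity


def maintainability_index(halstead_volume: float, complexity: int, loc: int) -> float:
    """Classic (unnormalized) Maintainability Index; assumed variant."""
    return (
        171
        - 5.2 * math.log(halstead_volume)
        - 0.23 * complexity
        - 16.2 * math.log(loc)
    )


def metric_delta(before: float, after: float) -> str:
    """Label the direction of a metric change across a commit."""
    if after > before:
        return "increased"
    if after < before:
        return "decreased"
    return "unchanged"
```

Under this formula, a commit that adds lines (for example, documentation or extracted helper code) lowers the Maintainability Index even when complexity is unchanged, which is one plausible reason a readability-focused change can register as a degradation.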
