SE · AI · Sep 17, 2025

GitHub's Copilot Code Review: Can AI Spot Security Flaws Before You Commit?

arXiv:2509.13650v1 · 1 citation · h-index: 18 · Has Code
Originality: Synthesis-oriented
AI Analysis

The study reveals a gap in AI-assisted code review for secure software development: Copilot's review offers only incremental value compared with dedicated security tools.

This study evaluated GitHub Copilot's code review feature for detecting security vulnerabilities and found that it frequently failed to identify critical flaws such as SQL injection and XSS, focusing instead on low-severity issues such as coding style and typographical errors.

As software development practices increasingly adopt AI-powered tools, ensuring that such tools can support secure coding has become critical. This study evaluates the effectiveness of GitHub Copilot's recently introduced code review feature in detecting security vulnerabilities. Using a curated set of labeled vulnerable code samples drawn from diverse open-source projects spanning multiple programming languages and application domains, we systematically assessed Copilot's ability to identify and provide feedback on common security flaws. Contrary to expectations, our results reveal that Copilot's code review frequently fails to detect critical vulnerabilities such as SQL injection, cross-site scripting (XSS), and insecure deserialization. Instead, its feedback primarily addresses low-severity issues, such as coding style and typographical errors. These findings expose a significant gap between the perceived capabilities of AI-assisted code review and its actual effectiveness in supporting secure development practices. Our results highlight the continued necessity of dedicated security tools and manual code audits to ensure robust software security.
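To make the class of flaw concrete, here is a minimal sketch of a SQL injection next to its parameterized fix. It is illustrative only and is not taken from the study's dataset; the table layout and the function names (`build_db`, `find_user_vulnerable`, `find_user_safe`) are hypothetical, and it uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_vulnerable(conn, name):
    # Vulnerable: user input is concatenated straight into the SQL text,
    # so quote characters in `name` can change the query's structure.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the value is bound as a parameter and is never parsed as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    payload = "' OR '1'='1"  # classic injection string
    print("vulnerable rows:", len(find_user_vulnerable(conn, payload)))
    print("safe rows:", len(find_user_safe(conn, payload)))
```

With the payload above, the concatenated query's `WHERE` clause becomes always true and returns every row, while the parameterized query treats the payload as a literal name and matches nothing. A stylistically clean pull request can still contain the first pattern, which is the kind of issue the abstract says Copilot's review tends to miss.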

