SE AI CY Jul 11, 2025

$\texttt{Droid}$: A Resource Suite for AI-Generated Code Detection

arXiv:2507.10583v3, 8 citations, h-index: 47, EMNLP
Originality: Incremental advance
AI Analysis

This work gives researchers and practitioners a resource for developing more reliable AI-generated code detectors, which matters for academic integrity and software security. Its contribution is incremental, since it builds on existing detection frameworks.

The authors address AI-generated code detection with two contributions. DroidCollection is a large open dataset of over a million code samples spanning seven programming languages and 43 coding models. DroidDetect is a suite of detectors trained on that dataset. Their experiments show that existing detectors fail to generalize beyond their training data and can be compromised by adversarial samples, and that training on adversarial data improves robustness.

In this work, we compile $\textbf{\texttt{DroidCollection}}$, the most extensive open data suite for training and evaluating machine-generated code detectors, comprising over a million code samples, seven programming languages, outputs from 43 coding models, and over three real-world coding domains. Alongside fully AI-generated samples, our collection includes human-AI co-authored code, as well as adversarial samples explicitly crafted to evade detection. Subsequently, we develop $\textbf{\texttt{DroidDetect}}$, a suite of encoder-only detectors trained using a multi-task objective over $\texttt{DroidCollection}$. Our experiments show that existing detectors' performance fails to generalise to diverse coding domains and programming languages outside of their narrow training data. Additionally, we demonstrate that while most detectors are easily compromised by humanising the output distributions using superficial prompting and alignment approaches, this problem can be easily amended by training on a small amount of adversarial data. Finally, we demonstrate the effectiveness of metric learning and uncertainty-based resampling as means to enhance detector training on possibly noisy distributions.

