SD AS · Jun 7

Probing Token Spaces under Generator Shift in AI-Generated Music Detection

Joonyong Park, Jungwoo Kim, Junyoung Koh, Yuki Saito

arXiv:2606.08663v19.7 · Has Code

Predicted impact top 9% in SD · last 90 days · Originality: Incremental advance

AI Analysis

For researchers building robust AI-generated music detectors, this work identifies codec-style token spaces as a critical experimental axis under generator shift.

The paper studies AI-generated music detection under generator shift, finding that standard benchmarks are nearly saturated while fake-source restriction reveals large differences between token spaces: X-Codec tokens excel with Udio training, MERT tokens with Suno-v3.5 training.

AI-generated music detectors can appear robust on standard benchmark splits, yet deployment requires transfer to generator sources absent during training. We study this problem with source-restricted evaluation on MoM-open, an open reconstruction of MoM-CLAM that replaces the non-redistributable real corpus with FMA and MTG-Jamendo while preserving the fake-generator protocol. To isolate the role of representation, we introduce CoMoE, a compact fixed classifier for comparing heterogeneous audio token spaces while keeping the downstream architecture and training recipe unchanged. Experiments show that standard and real-source-restricted splits are nearly saturated, whereas fake-source restriction exposes large differences between token spaces: X-Codec tokens are strongest when training on Udio alone, while MERT-derived tokens are stronger when training on Suno-v3.5 alone. These results suggest that codec-style discrete token spaces should be treated as a primary experimental axis under generator shift in AI-generated music detection. Our code and data are available at https://github.com/MAAP-LAB/CoMoE.
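
The fake-source restriction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record fields (`label`, `source`, `split`), the toy entries, and the function name `fake_source_restricted` are all hypothetical, and the choice to test only on fake clips from generators unseen in training is an assumption about how such a protocol might be set up.

```python
# Hypothetical toy metadata; field names and entries are illustrative only.
RECORDS = [
    {"id": "r1", "label": "real", "source": "FMA", "split": "train"},
    {"id": "r2", "label": "real", "source": "MTG-Jamendo", "split": "test"},
    {"id": "f1", "label": "fake", "source": "Udio", "split": "train"},
    {"id": "f2", "label": "fake", "source": "Suno-v3.5", "split": "train"},
    {"id": "f3", "label": "fake", "source": "Udio", "split": "test"},
    {"id": "f4", "label": "fake", "source": "Suno-v3.5", "split": "test"},
]


def fake_source_restricted(records, train_generator):
    """Restrict fake training clips to one generator (assumed protocol).

    Real clips keep their original split. Fake clips are kept for training
    only if they come from `train_generator`, and kept for testing only if
    they come from a different generator.
    """
    train = [
        r for r in records
        if r["split"] == "train"
        and (r["label"] == "real" or r["source"] == train_generator)
    ]
    test = [
        r for r in records
        if r["split"] == "test"
        and (r["label"] == "real" or r["source"] != train_generator)
    ]
    return train, test


train, test = fake_source_restricted(RECORDS, "Udio")
print([r["id"] for r in train])  # ['r1', 'f1']
print([r["id"] for r in test])   # ['r2', 'f4']
```

Under this assumed setup, a detector trained with "Udio" as the only fake source is scored on fake clips from generators it never saw, which is the kind of generator shift the abstract says separates the token spaces.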
