MusicDET: Zero-Shot AI-Generated Music Detection
To help preserve music authenticity, this work addresses a practical limitation of existing detectors, namely that they fail on music from unseen generators, and offers a more robust alternative.
MusicDET tackles zero-shot AI-generated music detection by training only on real music. It outperforms discriminative detectors on unseen generators, with consistent gains on the FakeMusicCaps and SONICS datasets.
Detecting AI-generated music is crucial for preserving artistic authenticity and preventing the misuse of generative music technologies. However, existing discriminative detectors typically rely on generated samples during training and often suffer from severe performance degradation when confronted with music produced by unseen generators, which limits their real-world applicability. To address this issue, we formulate a zero-shot setting for AI-generated music detection, where the detector is trained exclusively on real music without access to any generated samples. Under this setting, we propose MusicDET, a generator-agnostic detection framework based on frequency-guided normalizing flows that probabilistically models the distribution of real music features. By evaluating the likelihood of an input sample under the learned real-music distribution, MusicDET enables effective detection of out-of-distribution music signals. Experiments on the FakeMusicCaps and SONICS datasets show that MusicDET consistently outperforms conventional discriminative detectors, particularly when detecting music generated by previously unseen models.
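The likelihood-based decision rule described above can be illustrated with a minimal sketch. It is not the MusicDET architecture: the frequency-guided normalizing flow is replaced here by a single per-dimension affine flow, which is the simplest possible flow, and the feature vectors are assumed to be plain lists of floats. The function names (`fit_affine_flow`, `log_likelihood`, `fit_threshold`, `is_flagged`) and the 5% false-positive target are hypothetical choices made for illustration only.

```python
import math
import statistics


def fit_affine_flow(real_feats):
    """Fit a per-dimension affine flow z = (x - mu) / sigma using real samples only.

    Assumption: a stand-in for the paper's frequency-guided flow, not a reproduction of it.
    """
    columns = list(zip(*real_feats))
    mu = [statistics.fmean(c) for c in columns]
    # Small constant keeps sigma strictly positive for near-constant features.
    sigma = [statistics.pstdev(c) + 1e-6 for c in columns]
    return mu, sigma


def log_likelihood(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z = [(xi - m) / s for xi, m, s in zip(x, mu, sigma)]
    log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
    log_det = -sum(math.log(s) for s in sigma)  # dz/dx is diagonal with entries 1/sigma
    return log_base + log_det


def fit_threshold(real_feats, mu, sigma, false_positive_rate=0.05):
    """Choose the score below which roughly the given fraction of real samples falls."""
    scores = sorted(log_likelihood(x, mu, sigma) for x in real_feats)
    return scores[int(false_positive_rate * len(scores))]


def is_flagged(x, mu, sigma, threshold):
    """Flag a sample as out-of-distribution when its log-likelihood is below the threshold."""
    return log_likelihood(x, mu, sigma) < threshold
```

No generated sample is needed to fit the flow or to set the threshold, which is the property the zero-shot setting relies on. A more expressive flow would change only `fit_affine_flow` and `log_likelihood`; the thresholding step stays the same.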