SD | AI | AS | Sep 10, 2025

Segment Transformer: AI-Generated Music Detection via Music Structural Analysis

arXiv:2509.08283v1 | 3 citations | h-index: 1 | APSIPA
Originality: Incremental advance
AI Analysis

This work addresses copyright and authorship concerns in music information retrieval by improving the accuracy of AI-generated music detection. The advance is incremental, because it builds on existing transformer and feature extraction methods.

The paper tackles the problem of detecting AI-generated music by analyzing structural patterns in music segments, and reports high accuracy on the FakeMusicCaps and SONICS datasets.

Audio and music generation systems have advanced remarkably in the music information retrieval (MIR) research field. These advances raise copyright concerns, because ownership and authorship of AI-generated music (AIGM) remain unclear, and it can be difficult to determine clearly whether a piece was generated by AI or composed by humans. To address these challenges, we aim to improve the accuracy of AIGM detection by analyzing the structural patterns of music segments. Specifically, to extract musical features from short audio clips, we integrated various pre-trained models, including self-supervised learning (SSL) models and an audio effect encoder, each within our proposed transformer-based framework. Furthermore, for long audio, we developed a segment transformer that divides music into segments and learns inter-segment relationships. We used the FakeMusicCaps and SONICS datasets and achieved high accuracy in both the short-audio and full-audio detection experiments. These findings suggest that integrating segment-level musical features into long-range temporal analysis can effectively enhance both the performance and robustness of AIGM detection systems.
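The idea of dividing a track into segments and relating those segments to one another can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes fixed-length segments, mean pooling as the segment embedding, and a single attention step with no learned projections, whereas the abstract only states that the model divides music into segments and learns inter-segment relationships. All function names here are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def split_into_segments(frames: List[Vector], seg_len: int) -> List[List[Vector]]:
    """Chop frame-level features into fixed-length segments.

    Assumption: fixed-length segments; any incomplete tail is dropped.
    """
    n_segments = len(frames) // seg_len
    return [frames[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def mean_pool(segment: List[Vector]) -> Vector:
    """Summarize one segment as the average of its frame vectors."""
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings: List[Vector]) -> Tuple[List[List[float]], List[Vector]]:
    """Scaled dot-product self-attention across segment embeddings.

    Simplification: queries, keys, and values are the embeddings themselves;
    a trained transformer would apply learned projections first.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in embeddings:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in embeddings]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * embeddings[j][d] for j in range(len(embeddings))) for d in range(dim)]
        )
    return weights, outputs


# Toy input: an A-B-A structure made of 12 two-dimensional frames.
frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4
segments = split_into_segments(frames, seg_len=4)
segment_embeddings = [mean_pool(seg) for seg in segments]
attention_weights, contextual_embeddings = self_attention(segment_embeddings)
```

In this toy example the first segment attends more strongly to the third segment, which repeats it, than to the contrasting second segment. That is the kind of inter-segment relationship a detector could draw on when classifying a full track.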

