SD · CV · AS · Mar 8, 2024

Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings

arXiv:2403.05380v1 · 4 citations · h-index: 9 · WIFS
Originality: Incremental advance
AI Analysis

This work addresses the need for detecting Auto-Tuned vocals to support music scholars, producers, and listeners in analyzing authenticity, but it appears incremental as it builds on existing audio forensic techniques.

This study tackles the detection of Auto-Tuned vocals in music recordings with a data-driven approach based on triplet networks, supported by a new dataset of original and Auto-Tuned clips. The method is reported to be more accurate and more robust than the RawNet2 baseline, though the abstract gives no specific numerical results.

In the domain of music production and audio processing, the implementation of automatic pitch correction of the singing voice, also known as Auto-Tune, has significantly transformed the landscape of vocal performance. While auto-tuning technology has offered musicians the ability to tune their vocal pitches and achieve a desired level of precision, its use has also sparked debates regarding its impact on authenticity and artistic integrity. As a result, detecting and analyzing Auto-Tuned vocals in music recordings has become essential for music scholars, producers, and listeners. However, to the best of our knowledge, no prior effort has been made in this direction. This study introduces a data-driven approach leveraging triplet networks for the detection of Auto-Tuned songs, backed by the creation of a dataset composed of original and Auto-Tuned audio clips. The experimental results demonstrate the superiority of the proposed method in both accuracy and robustness compared to RawNet2, an end-to-end model proposed for anti-spoofing and widely used for other audio forensic tasks.
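To make the triplet-network idea in the abstract more concrete, here is a minimal sketch of the standard triplet margin loss in Python. The abstract does not specify the paper's architecture, distance metric, margin, or how triplets are formed, so everything below is an illustrative assumption: the function name `triplet_margin_loss`, the Euclidean distance, the margin of 1.0, and the toy 2-D embeddings are hypothetical. In a real system the embeddings would come from a network applied to spectrograms.

```python
import numpy as np


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss over batches of embeddings.

    Each argument has shape (batch, dim). The loss pushes each anchor
    closer to its positive (same class) than to its negative (other
    class) by at least `margin`. This is the textbook formulation, not
    necessarily the exact loss used in the paper.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # anchor-negative distance
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()


# Toy embeddings (assumed values, for illustration only).
# For example, anchor and positive could both be embeddings of
# Auto-Tuned clips, and negative an embedding of an original clip.
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.0, 1.0], [1.0, 1.0]])
negative = np.array([[3.0, 4.0], [1.0, 1.5]])

loss = triplet_margin_loss(anchor, positive, negative, margin=1.0)
print(loss)
```

In this toy batch the first triplet already satisfies the margin and contributes zero loss, while the second does not, so only the second would drive a gradient update during training.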

Code Implementations: 1 repo