SD, LG
Jun 1, 2016

Automatic tagging using deep convolutional neural networks

arXiv:1606.00298v1
348 citations
Originality: Incremental advance
AI Analysis

This work addresses content-based music tagging, with applications in music recommendation and organization. It is rated as incremental because it builds on existing deep learning methods for audio processing.

The authors tackled automatic music tagging by evaluating fully convolutional neural networks (FCNs) with different architectures and input types. A 4-layer model achieved state-of-the-art performance on the MagnaTagATune dataset, and deeper models performed better on the larger Million Song Dataset.
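To make the idea of a network built only from convolutional and subsampling layers more concrete, the sketch below tracks how a spectrogram's shape shrinks through such a stack until one value per feature map remains. The input size (96 mel bins by 1366 frames) and the four pooling sizes are illustrative assumptions, not values stated on this page, and `fcn_output_shape` is a hypothetical helper. Convolutions are assumed to use "same" padding, so only pooling changes the shape.

```python
def fcn_output_shape(input_shape, pool_sizes):
    """Return (freq, time) after a stack of conv + max-pooling layers.

    Assumes each convolution uses "same" padding (shape unchanged) and
    each pooling layer uses a stride equal to its pool size, discarding
    any remainder (floor division).
    """
    freq, time = input_shape
    for pool_freq, pool_time in pool_sizes:
        freq //= pool_freq
        time //= pool_time
        if freq < 1 or time < 1:
            raise ValueError("pooling reduced a dimension below 1")
    return freq, time


# Hypothetical 4-layer configuration (assumed values for illustration).
ASSUMED_INPUT = (96, 1366)                         # mel bins x time frames
ASSUMED_POOLS = [(2, 4), (4, 5), (3, 8), (4, 8)]   # one entry per layer

final_shape = fcn_output_shape(ASSUMED_INPUT, ASSUMED_POOLS)
print(final_shape)
```

With these assumed sizes, the feature maps collapse to a single position, so the final layer's channels can feed the per-tag outputs directly without a dense layer.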

We present a content-based automatic music tagging algorithm using fully convolutional neural networks (FCNs). We evaluate different architectures consisting of 2D convolutional layers and subsampling layers only. In the experiments, we measure the AUC-ROC scores of the architectures with different complexities and input types using the MagnaTagATune dataset, where a 4-layer architecture shows state-of-the-art performance with mel-spectrogram input. Furthermore, we evaluated the performances of the architectures with varying the number of layers on a larger dataset (Million Song Dataset), and found that deeper models outperformed the 4-layer architecture. The experiments show that mel-spectrogram is an effective time-frequency representation for automatic tagging and that more complex models benefit from more training data.
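The abstract reports AUC-ROC scores for a multi-label tagging task. As a minimal sketch of how such a score can be computed, the code below derives AUC-ROC for one tag from the rank-sum (Mann-Whitney) statistic and then averages over tags. Averaging per-tag scores with equal weight is an assumption here, not something the abstract specifies, and `auc_roc` and `mean_tag_auc` are hypothetical helper names.

```python
def auc_roc(labels, scores):
    """AUC-ROC for one tag via the rank-sum statistic.

    labels: list of 0/1 ground-truth values, one per clip.
    scores: list of predicted scores, one per clip.
    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one positive and one negative")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of clips that share the same score.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_tag_auc(label_rows, score_rows):
    """Average per-tag AUC-ROC; rows are clips, columns are tags."""
    n_tags = len(label_rows[0])
    per_tag = [
        auc_roc([row[t] for row in label_rows], [row[t] for row in score_rows])
        for t in range(n_tags)
    ]
    return sum(per_tag) / n_tags


# Toy example: four clips, two tags.
toy_labels = [[0, 0], [0, 0], [1, 1], [1, 1]]
toy_scores = [[0.1, 0.1], [0.4, 0.2], [0.35, 0.8], [0.8, 0.9]]
print(mean_tag_auc(toy_labels, toy_scores))
```

In the toy data, the first tag has one positive clip ranked below a negative clip, while the second tag is ranked perfectly, so the average falls between the two per-tag scores.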

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

