The Effects of Noisy Labels on Deep Convolutional Neural Networks for Music Tagging
This work addresses the problem of noisy labels in music tagging datasets, which is relevant to both researchers and practitioners. Its contribution is incremental, since it focuses on analysis and validation rather than on introducing new methods.
The study investigated the impact of noisy labels on deep convolutional neural networks for music tagging, showing that networks can remain effective despite high error rates in groundtruth datasets and that label noise may explain performance variations across tags.
Deep neural networks (DNNs) have been successfully applied to music classification, including music tagging. However, there are several open questions regarding the training, evaluation, and analysis of DNNs. In this article, we investigate one specific aspect of neural networks, the effects of noisy labels, to deepen our understanding of their properties. We analyse and (re-)validate a large music tagging dataset to investigate the reliability of training and evaluation. Using a trained network, we compute label vector similarities, which are compared to groundtruth similarity. The results highlight several important aspects of music tagging and neural networks. We show that networks can be effective despite relatively large error rates in groundtruth datasets, and we conjecture that label noise may be the cause of tag-wise performance differences. Lastly, the analysis of our trained network provides valuable insight into the relationships between music tags. These results highlight the benefit of using data-driven methods to address automatic music tagging.
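The comparison between label vector similarity and groundtruth similarity can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: that a tag's label vector is the row of final-layer weights associated with that tag, that similarity between label vectors is measured with cosine similarity, and that groundtruth similarity is a row-normalised tag co-occurrence. The function names and the toy arrays are hypothetical and serve only as an example.

```python
import numpy as np


def label_vector_similarity(weights):
    """Cosine similarity between per-tag vectors.

    `weights` has shape (n_tags, n_features); each row is assumed to be
    the label vector of one tag (an assumption, not the paper's stated API).
    """
    weights = np.asarray(weights, dtype=float)
    norms = np.linalg.norm(weights, axis=1, keepdims=True)
    unit = weights / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T


def cooccurrence_similarity(labels):
    """Row-normalised co-occurrence computed from binary groundtruth labels.

    `labels` has shape (n_items, n_tags). Entry (i, j) of the result is the
    fraction of items tagged with i that are also tagged with j, so the
    matrix is not symmetric in general.
    """
    labels = np.asarray(labels, dtype=float)
    counts = labels.T @ labels            # counts[i, j]: items with both tags
    per_tag = np.diag(counts).copy()      # counts[i, i]: items with tag i
    return counts / np.clip(per_tag[:, None], 1.0, None)


# Toy data (hypothetical): 4 items, 3 tags.
toy_labels = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [0, 0, 1],
                       [1, 1, 0]])
# Toy label vectors (hypothetical): 3 tags, 2 features.
toy_weights = np.array([[1.0, 0.0],
                        [2.0, 0.0],
                        [0.0, 3.0]])

groundtruth_sim = cooccurrence_similarity(toy_labels)
network_sim = label_vector_similarity(toy_weights)
```

Because the co-occurrence matrix is normalised per row, it is asymmetric, whereas cosine similarity is symmetric. Any direct comparison of the two matrices has to take that difference into account.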