SD, LG, AS | Feb 27, 2018

Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification

arXiv:1802.09697v1 | 43 citations
AI Analysis

This work addresses automated music genre classification for applications in music analysis. It shows a significant improvement over previous methods, although its approach is incremental.

The paper tackled music genre classification by proposing a method that combines human perception knowledge and auditory neurophysiology, achieving human-level accuracy of 70% on a 10-genre task.

Music genre classification is one example of content-based analysis of music signals. Traditionally, human-engineered features were used to automate this task, and 61% accuracy has been achieved in 10-genre classification. However, this is still below the 70% accuracy that humans achieve on the same task. Here, we propose a new method that combines knowledge from human perception studies of music genre classification with the neurophysiology of the auditory system. The method works by training a simple convolutional neural network (CNN) to classify a short segment of the music signal. The genre of a piece of music is then determined by splitting it into short segments and combining the CNN's predictions from all segments. After training, this method achieves human-level (70%) accuracy, and the filters learned by the CNN resemble the spectrotemporal receptive fields (STRFs) in the auditory system.
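
The segment-then-combine step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how segment predictions are combined, so averaging the per-segment probabilities is assumed here (majority voting would be another option). The names `split_into_segments`, `classify_track`, and `toy_segment_classifier` are hypothetical, and the toy classifier is only a deterministic stand-in for a trained CNN.

```python
import numpy as np


def split_into_segments(signal, segment_len, hop):
    """Return overlapping windows as an array of shape (n_segments, segment_len)."""
    if len(signal) < segment_len:
        raise ValueError("signal is shorter than one segment")
    starts = range(0, len(signal) - segment_len + 1, hop)
    return np.stack([signal[s:s + segment_len] for s in starts])


def classify_track(signal, segment_classifier, segment_len, hop):
    """Predict one genre index for a whole track.

    segment_classifier maps (n_segments, segment_len) to per-segment
    probabilities of shape (n_segments, n_genres).
    """
    segments = split_into_segments(signal, segment_len, hop)
    probs = segment_classifier(segments)
    # Assumption: combine segment predictions by averaging probabilities.
    mean_probs = probs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs


def toy_segment_classifier(segments, n_genres=10):
    """Hypothetical stand-in for a trained CNN (not a real model).

    Scores each genre index by how close it is to 10x the segment mean,
    then applies a softmax so each row sums to 1.
    """
    level = segments.mean(axis=1, keepdims=True) * n_genres
    logits = -np.abs(level - np.arange(n_genres))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    track = np.full(1000, 0.3)  # synthetic "signal" for illustration only
    genre, probs = classify_track(track, toy_segment_classifier, 100, 50)
    print(genre, probs.round(3))
```

In a real pipeline the segments would more likely be spectrogram excerpts than raw samples, and `segment_classifier` would wrap the trained network's forward pass; both details are outside what the abstract specifies.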

Code Implementations: 2 repos