Nov 24, 2020

A Novel Multimodal Music Genre Classifier using Hierarchical Attention and Convolutional Neural Network

arXiv:2011.11970v1
AI Analysis

This paper tackles the problem of music genre classification for music information retrieval, offering an incremental approach by combining existing methods.

This paper addresses multimodal music genre classification by combining audio and lyrical content. It uses a CNN for spectrogram feature extraction and a Hierarchical Attention Network for lyrics, then classifies music tracks based on the fused feature vector.

Music genre classification is one of the trending topics in current Music Information Retrieval (MIR) research. Since genre does not depend on the audio profile alone, we also make use of the textual content provided as the lyrics of the corresponding song. We implemented a CNN-based feature extractor for spectrograms to incorporate the acoustic features, and a Hierarchical Attention Network-based feature extractor for the lyrics. We then classify the music track based on the resulting fused feature vector.
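The two-branch design described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the two feature vectors are fused, so plain concatenation is assumed here, the CNN output is replaced by a random stand-in vector, and the lyric branch is reduced to a single simplified attention-pooling step (a real Hierarchical Attention Network applies attention at both word and sentence level on top of recurrent encoders). All function names, dimensions, and weights below are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weight each hidden state by its dot product with a context vector, then sum.

    Simplified stand-in for one attention level of the lyric branch;
    the projection layer used in full attention networks is omitted.
    """
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]


def fuse_and_classify(audio_features, lyric_features, weights, biases):
    """Concatenate both modality vectors, apply one linear layer, return genre probabilities."""
    fused = audio_features + lyric_features  # assumed fusion: list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(weights, biases)]
    return softmax(logits)


rng = random.Random(0)
AUDIO_DIM, LYRIC_DIM, NUM_GENRES = 4, 3, 5  # illustrative sizes, not from the paper

# Stand-in for the CNN's spectrogram feature vector.
audio_vec = [rng.uniform(-1, 1) for _ in range(AUDIO_DIM)]

# Stand-ins for per-word hidden states of the lyrics and a context vector.
word_states = [[rng.uniform(-1, 1) for _ in range(LYRIC_DIM)] for _ in range(6)]
context = [rng.uniform(-1, 1) for _ in range(LYRIC_DIM)]
lyric_vec = attention_pool(word_states, context)

# Untrained, randomly initialised classifier weights.
W = [[rng.uniform(-0.5, 0.5) for _ in range(AUDIO_DIM + LYRIC_DIM)] for _ in range(NUM_GENRES)]
b = [0.0] * NUM_GENRES

probs = fuse_and_classify(audio_vec, lyric_vec, W, b)
print(len(probs), round(sum(probs), 6))
```

Concatenation keeps both modalities intact and lets the classifier learn how much weight each one deserves; other fusion schemes (weighted sums, gating) would slot into `fuse_and_classify` without changing the two feature extractors.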

