SD · LG · AS · Oct 10, 2021

Multi-task Learning with Metadata for Music Mood Classification

arXiv:2110.04765v1
Originality: Incremental advance
AI Analysis

This work addresses mood classification in music informatics, with applications in music discovery and recommendation. It is incremental in that it builds on existing state-of-the-art convolutional neural networks for mood classification.

The paper tackles the problem of improving music mood classification by leveraging readily available audio metadata, such as artist and year, through a multi-task learning approach. It reports improvements of up to 8.7 points in average precision in experiments on multiple datasets.

Mood recognition is an important problem in music informatics and has key applications in music discovery and recommendation. These applications have become even more relevant with the rise of music streaming. Our work investigates whether readily available audio metadata, such as artist and year, can be leveraged to improve the performance of mood classification models. To this end, we propose a multi-task learning approach in which a shared model is trained simultaneously on mood and metadata prediction tasks, with the goal of learning richer representations. Experimentally, we demonstrate that applying our technique to the existing state-of-the-art convolutional neural networks for mood classification consistently improves their performance. We conduct experiments on multiple datasets and report that our approach can lead to improvements of up to 8.7 points in average precision.
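
The multi-task setup described above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation. It makes several assumptions: a single dense layer stands in for the shared convolutional network, mood is treated as a multi-label target, artist and year are treated as classification targets (with year binned), and the losses are combined with a hypothetical `metadata_weight`. All function and parameter names are illustrative.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, in a numerically stable form."""
    terms = [max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
             for z, t in zip(logits, targets)]
    return sum(terms) / len(terms)


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one example, using the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def init_params(n_features, n_hidden, n_moods, n_artists, n_year_bins, seed=0):
    """Small random weights and zero biases for the shared layer and each head."""
    rng = random.Random(seed)

    def layer(n_out, n_in):
        return {
            "w": [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out,
        }

    return {
        "shared": layer(n_hidden, n_features),   # assumed stand-in for the CNN
        "mood": layer(n_moods, n_hidden),        # main task head
        "artist": layer(n_artists, n_hidden),    # auxiliary metadata head
        "year": layer(n_year_bins, n_hidden),    # auxiliary metadata head
    }


def multitask_loss(features, params, mood_targets, artist_index, year_index,
                   metadata_weight=0.5):
    """Return the combined loss and the per-task losses for one example."""
    def apply(name, x):
        return linear(x, params[name]["w"], params[name]["b"])

    # Every head reads the same shared representation (ReLU activation).
    shared = [max(0.0, h) for h in apply("shared", features)]

    losses = {
        "mood": bce_with_logits(apply("mood", shared), mood_targets),
        "artist": cross_entropy(apply("artist", shared), artist_index),
        "year": cross_entropy(apply("year", shared), year_index),
    }
    # Assumed weighting scheme: mood loss plus down-weighted metadata losses.
    total = losses["mood"] + metadata_weight * (losses["artist"] + losses["year"])
    return total, losses


if __name__ == "__main__":
    params = init_params(n_features=4, n_hidden=3, n_moods=2,
                         n_artists=5, n_year_bins=4)
    total, losses = multitask_loss([0.1, 0.2, 0.3, 0.4], params,
                                   mood_targets=[1.0, 0.0],
                                   artist_index=2, year_index=1)
    print(total, losses)
```

Because all three heads read the same shared representation, gradients from the metadata losses would also update the shared layer during training. That is the mechanism by which the auxiliary tasks are meant to enrich the representation used for mood prediction. At inference time only the mood head would be needed.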
