SD · AI · CV · LG · AS · Sep 25, 2024

The Effect of Perceptual Metrics on Music Representation Learning for Genre Classification

arXiv:2409.17069v1 · h-index: 27
Originality: Synthesis-oriented
AI Analysis

This work is aimed at researchers and practitioners working on music understanding tasks. Its contribution is incremental: it builds on existing applications of perceptual metrics.

The study aims to improve music genre classification by using perceptual metrics as loss functions when training autoencoders for representation learning. Features extracted from these autoencoders give better performance than using the same metrics directly as distances in a classifier.

The subjective quality of natural signals can be approximated with objective perceptual metrics. Designed to approximate the perceptual behaviour of human observers, perceptual metrics often reflect structures found in natural signals and neurological pathways. Models trained with perceptual metrics as loss functions can capture perceptually meaningful features from the structures held within these metrics. We demonstrate that using features extracted from autoencoders trained with perceptual losses can improve performance on music understanding tasks, i.e. genre classification, over using these metrics directly as distances when learning a classifier. This result suggests improved generalisation to novel signals when using perceptual metrics as loss functions for representation learning.
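The two routes contrasted in the abstract can be sketched as follows. This is only a minimal illustration under loud assumptions, not the authors' setup: `perceptual_distance` is a hypothetical stand-in (a weighted squared error, not a real perceptual metric), the autoencoder is linear, the data are synthetic, and a 1-nearest-neighbour rule plays the role of the classifier in both routes.

```python
import numpy as np

def perceptual_distance(a, b, weights):
    """Hypothetical stand-in metric: weighted squared error (NOT a real perceptual metric)."""
    return np.sum(weights * (a - b) ** 2, axis=-1)

def nearest_neighbour_predict(train_x, train_y, query_x, distance):
    """1-NN: give each query the label of its closest training sample."""
    # Pairwise distances via broadcasting, shape (n_query, n_train).
    d = distance(query_x[:, None, :], train_x[None, :, :])
    return train_y[np.argmin(d, axis=1)]

def train_autoencoder(x, weights, latent_dim=3, steps=300, lr=0.01, seed=0):
    """Fit a linear autoencoder by gradient descent, using the metric as the loss."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    enc = rng.normal(scale=0.1, size=(d, latent_dim))
    dec = rng.normal(scale=0.1, size=(latent_dim, d))
    history = []
    for _ in range(steps):
        z = x @ enc                      # latent features
        recon = z @ dec                  # reconstruction
        history.append(perceptual_distance(x, recon, weights).mean())
        # Gradient of the mean weighted squared error w.r.t. the reconstruction.
        grad_recon = 2.0 * weights * (recon - x) / n
        grad_dec = z.T @ grad_recon
        grad_enc = x.T @ (grad_recon @ dec.T)
        enc -= lr * grad_enc
        dec -= lr * grad_dec
    return enc, dec, history

# Synthetic two-class data (an assumption; stands in for audio features).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 30)
x = rng.normal(size=(60, 8))
x[:, :2] += np.where(labels[:, None] == 1, 1.0, -1.0)
weights = np.linspace(2.0, 0.5, 8)       # hypothetical per-dimension weights
train, test = np.arange(0, 60, 2), np.arange(1, 60, 2)

# Route A: use the metric directly as the distance inside the classifier.
metric = lambda a, b: perceptual_distance(a, b, weights)
pred_direct = nearest_neighbour_predict(x[train], labels[train], x[test], metric)

# Route B: use the metric as a training loss, then classify the learned features.
enc, dec, history = train_autoencoder(x[train], weights)
euclid = lambda a, b: np.sum((a - b) ** 2, axis=-1)
pred_features = nearest_neighbour_predict(
    x[train] @ enc, labels[train], x[test] @ enc, euclid
)

print("direct-metric accuracy:", np.mean(pred_direct == labels[test]))
print("learned-feature accuracy:", np.mean(pred_features == labels[test]))
```

On toy data like this, the two accuracies say nothing about the paper's result. The sketch only shows where the metric enters each pipeline: as the classifier's distance in route A, and as the autoencoder's training loss in route B.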

