SD · LG · AS · Oct 7, 2020

Less is more: Faster and better music version identification with embedding distillation

arXiv:2010.03284v1 · 2 citations
Originality: Incremental advance
AI Analysis

This incremental improvement makes real-world music retrieval systems more practical on standalone devices, which benefits users of music identification and analysis tools.

The paper tackles the trade-off between accuracy and scalability in music version identification by using embedding distillation to reduce the embedding dimensionality of a pre-trained model, obtaining 99% smaller embeddings and up to a 3% accuracy increase.

Version identification systems aim to detect different renditions of the same underlying musical composition (loosely called cover songs). By learning to encode entire recordings into plain vector embeddings, recent systems have made significant progress in bridging the gap between accuracy and scalability, which has been a key challenge for nearly two decades. In this work, we propose to further narrow this gap by employing a set of data distillation techniques that reduce the embedding dimensionality of a pre-trained state-of-the-art model. We compare a wide range of techniques and propose new ones, from classical dimensionality reduction to more sophisticated distillation schemes. With those, we obtain 99% smaller embeddings that, moreover, yield up to a 3% accuracy increase. Such small embeddings can have an important impact in retrieval time, up to the point of making a real-world system practical on a standalone laptop.
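To make the size claim concrete, here is a minimal sketch of what a 99% reduction in embedding dimensionality means for storage. It is not the paper's distillation method: it uses a Gaussian random projection as a simple stand-in for classical dimensionality reduction, and the dimensions (`D_IN`, `D_OUT`), the collection size (`N_TRACKS`), and all function names are hypothetical values chosen only for illustration.

```python
import math
import random


def random_projection_matrix(d_in, d_out, seed=0):
    """Build a (d_out x d_in) Gaussian random projection matrix.

    Entries are scaled by 1/sqrt(d_out), the usual scaling for
    random projections. This is a stand-in, not the paper's method.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, vector):
    """Map a d_in-dimensional vector to d_out dimensions."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def embedding_bytes(n_items, dim, bytes_per_value=4):
    """Storage needed for n_items embeddings of size dim (float32 assumed)."""
    return n_items * dim * bytes_per_value


# Hypothetical sizes, chosen only so that the reduction is exactly 99%.
D_IN, D_OUT = 1000, 10
N_TRACKS = 100_000

matrix = random_projection_matrix(D_IN, D_OUT)
rng = random.Random(1)
embedding = [rng.gauss(0.0, 1.0) for _ in range(D_IN)]  # fake embedding
small = project(matrix, embedding)

reduction = 1 - D_OUT / D_IN
full_size = embedding_bytes(N_TRACKS, D_IN)
small_size = embedding_bytes(N_TRACKS, D_OUT)

print(f"dimensions: {D_IN} -> {len(small)} ({reduction:.0%} smaller)")
print(f"storage: {full_size / 1e6:.0f} MB -> {small_size / 1e6:.0f} MB")
```

Because a brute-force nearest-neighbour search computes one distance per stored embedding, and each distance costs time proportional to the dimensionality, the same factor applies roughly to retrieval time as well as to storage.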

Code Implementations: 1 repo
