IR AI LG SD ML
Oct 3, 2018

Disambiguating Music Artists at Scale with Audio Metric Learning

Jimena Royo-Letelier, Romain Hennequin, Viet-Anh Tran, Manuel Moussallam

arXiv:1810.01807v18.58 citationsHas Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of organizing and clustering music artists in large-scale databases, an incremental improvement for music streaming and recommendation systems.

The paper tackles the problem of disambiguating music artists in large catalogs by learning artist embeddings directly from audio using metric learning. It shows that the proposed system outperforms a classifier-based approach when sufficient audio data is available for each artist, and it proposes a new negative sampling method that uses side information such as genre.
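
To make "metric learning" concrete, the following is a minimal sketch of a triplet loss, one common metric learning objective. It is an assumption that a loss of this form is relevant here, since the summary does not state the exact objective used in the paper. The function names, the margin value, and the toy 2-D embeddings are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (illustrative, not the paper's exact loss).

    The loss is zero once the anchor is closer to the positive (same artist)
    than to the negative (different artist) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values): two tracks by one artist, one by another.
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
negative = [1.0, 0.0]

# Well-separated triplet: the margin is already satisfied, so the loss is zero.
easy = triplet_loss(anchor, positive, negative)

# Swapped roles: the "positive" is far and the "negative" is near, so the loss is large.
hard = triplet_loss(anchor, negative, positive)
```

In a real system the embeddings would come from a network applied to audio, and the loss would be averaged over many sampled triplets.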

We address the problem of disambiguating large-scale catalogs through the definition of an unknown artist clustering task. We explore the use of metric learning techniques to learn artist embeddings directly from audio, and using a dedicated homonym artists dataset, we compare our method with a recent approach that learns similar embeddings using artist classifiers. While both systems have the ability to disambiguate unknown artists relying exclusively on audio, we show that our system is more suitable when enough audio data is available for each artist in the training dataset. We also propose a new negative sampling method for metric learning that takes advantage of side information such as music genre during the learning phase and shows promising results for the artist clustering task.
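
The abstract does not describe how genre enters the negative sampling, so the following is only a sketch of one plausible scheme: with some probability, draw the negative artist from the same genre as the anchor, on the assumption that same-genre artists make harder negatives. The function `sample_negative`, the parameter `same_genre_prob`, and the toy artist-to-genre mapping are all hypothetical.

```python
import random


def sample_negative(anchor_artist, artist_genre, rng, same_genre_prob=0.5):
    """Pick a negative artist for `anchor_artist` (illustrative scheme only).

    With probability `same_genre_prob`, the negative is drawn from artists
    sharing the anchor's genre; otherwise it is drawn uniformly from all
    other artists.
    """
    others = [a for a in artist_genre if a != anchor_artist]
    if not others:
        raise ValueError("need at least one artist other than the anchor")
    anchor_genre = artist_genre[anchor_artist]
    same_genre = [a for a in others if artist_genre[a] == anchor_genre]
    if same_genre and rng.random() < same_genre_prob:
        return rng.choice(same_genre)
    return rng.choice(others)


# Toy side information (hypothetical artists and genres).
artist_genre = {
    "artist_a": "rock",
    "artist_b": "rock",
    "artist_c": "jazz",
    "artist_d": "jazz",
}

rng = random.Random(0)  # seeded for reproducibility
negative = sample_negative("artist_a", artist_genre, rng, same_genre_prob=1.0)
```

With `same_genre_prob=1.0` the sampler always returns a same-genre artist when one exists; with `0.0` it reduces to uniform sampling over all other artists.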
