IR · LG · Jul 24, 2025

Semantic IDs for Music Recommendation

M. Jeffrey Mei, Florian Henkel, Samuel E. Sandberg, Oliver Bembom, Andreas F. Ehmann

arXiv:2507.18800v111.87 citations · h-index: 2 · RecSys

Originality: Incremental advance

AI Analysis

This work addresses memory efficiency for music streaming services, but it is an incremental advance because it builds on existing shared embedding methods.

The paper tackles the problem of reducing memory usage in recommender systems by replacing unique per-item embeddings with shared content-based features (semantic IDs). The authors report improved recommendation accuracy and diversity alongside a smaller model on two music datasets, including an online A/B test.

Training recommender systems for next-item recommendation often requires unique embeddings to be learned for each item, which may take up most of the trainable parameters for a model. Shared embeddings, such as using content information, can reduce the number of distinct embeddings to be stored in memory. This allows for a more lightweight model; correspondingly, model complexity can be increased due to having fewer embeddings to store in memory. We show the benefit of using shared content-based features ('semantic IDs') in improving recommendation accuracy and diversity, while reducing model size, for two music recommendation datasets, including an online A/B test on a music streaming service.
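
The memory argument in the abstract can be illustrated with a small parameter count. The sketch below is not the authors' implementation: it assumes, purely for illustration, that a semantic ID is a short tuple of discrete codes and that an item's embedding is the sum of one shared embedding per code. All sizes (NUM_ITEMS, DIM, NUM_LEVELS, CODEBOOK_SIZE) and the example IDs are hypothetical.

```python
import random


def unique_embedding_params(num_items: int, dim: int) -> int:
    """Parameters needed when every item has its own embedding row."""
    return num_items * dim


def shared_embedding_params(num_levels: int, codebook_size: int, dim: int) -> int:
    """Parameters needed when items share one embedding table per code level."""
    return num_levels * codebook_size * dim


def embed_item(semantic_id, tables):
    """Sum the shared code embeddings selected by an item's semantic ID.

    Assumption: summation is one common way to combine code embeddings;
    the paper may combine them differently.
    """
    dim = len(tables[0][0])
    out = [0.0] * dim
    for level, code in enumerate(semantic_id):
        row = tables[level][code]
        out = [a + b for a, b in zip(out, row)]
    return out


# Hypothetical sizes, chosen only to make the comparison concrete.
NUM_ITEMS = 1_000_000
DIM = 64
NUM_LEVELS = 4
CODEBOOK_SIZE = 256

# One small shared table per code level, randomly initialised.
rng = random.Random(0)
tables = [
    [[rng.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(CODEBOOK_SIZE)]
    for _ in range(NUM_LEVELS)
]

# Two hypothetical tracks with similar content share their first three codes.
track_a = (12, 200, 7, 31)
track_b = (12, 200, 7, 90)

unique = unique_embedding_params(NUM_ITEMS, DIM)
shared = shared_embedding_params(NUM_LEVELS, CODEBOOK_SIZE, DIM)
vec_a = embed_item(track_a, tables)
vec_b = embed_item(track_b, tables)

print(f"unique per-item embeddings: {unique:,} parameters")
print(f"shared code embeddings:     {shared:,} parameters")
```

Under these assumed sizes the shared tables hold 65,536 parameters against 64,000,000 for per-item rows. This is the kind of saving that, per the abstract, can be spent on a more complex model.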
