Calibrated Recommendations with Contextual Bandits
This work addresses the challenge of personalized content recommendation on platforms like Spotify, where user preferences vary by context. It is incremental: it builds on existing calibration methods by adding contextual adaptation.
The paper tackles the problem of delivering a balanced and personalized content mix on Spotify's Home page, where historical data is skewed toward music. It proposes a calibration method that uses contextual bandits to learn the optimal content type distribution from each user's context and preferences. The reported result is improved precision and user engagement, especially for underrepresented content types such as podcasts.
Spotify's Home page features a variety of content types, including music, podcasts, and audiobooks. However, historical data is heavily skewed toward music, making it challenging to deliver a balanced and personalized content mix. Moreover, users' preferences for different content types may vary depending on the time of day, the day of the week, or even the device they use. We propose a calibration method that leverages contextual bandits to dynamically learn each user's optimal content type distribution based on their context and preferences. Unlike traditional calibration methods that rely on historical averages, our approach boosts engagement by adapting to how users' interests in different content types vary across contexts. Both offline and online results demonstrate improved precision and user engagement with the Spotify Home page, in particular with under-represented content types such as podcasts.
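To make the idea of learning a context-dependent content type distribution concrete, here is a minimal Python sketch. It is not the paper's model, which the abstract does not specify. It assumes a tabular Beta-Bernoulli bandit with Thompson sampling, a binary engagement signal, and a context encoded as a simple tuple. All names (`ContextualCalibrator`, `allocate_slots`, the context key, the content type labels) are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict

CONTENT_TYPES = ("music", "podcast", "audiobook")  # illustrative labels


class ContextualCalibrator:
    """Tabular Beta-Bernoulli bandit: one posterior per (context, content type).

    Assumption: this stands in for whatever bandit model the paper uses.
    """

    def __init__(self, prior_success=1.0, prior_failure=1.0):
        # Beta(1, 1) prior by default, so unseen contexts start with a uniform mix.
        self.successes = defaultdict(lambda: dict.fromkeys(CONTENT_TYPES, prior_success))
        self.failures = defaultdict(lambda: dict.fromkeys(CONTENT_TYPES, prior_failure))

    def update(self, context, content_type, engaged):
        """Record one impression and whether the user engaged with it."""
        counts = self.successes if engaged else self.failures
        counts[context][content_type] += 1.0

    @staticmethod
    def _normalize(scores):
        total = sum(scores.values())
        return {t: s / total for t, s in scores.items()}

    def mean_distribution(self, context):
        """Exploit: normalize posterior-mean engagement rates into a target mix."""
        s, f = self.successes[context], self.failures[context]
        return self._normalize({t: s[t] / (s[t] + f[t]) for t in CONTENT_TYPES})

    def sampled_distribution(self, context, rng):
        """Explore: Thompson sampling, one Beta draw per content type."""
        s, f = self.successes[context], self.failures[context]
        return self._normalize({t: rng.betavariate(s[t], f[t]) for t in CONTENT_TYPES})


def allocate_slots(distribution, n_slots):
    """Turn a target mix into integer slot counts using largest-remainder rounding."""
    exact = {t: p * n_slots for t, p in distribution.items()}
    slots = {t: int(x) for t, x in exact.items()}
    leftover = n_slots - sum(slots.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - slots[t], reverse=True)
    for t in by_remainder[:leftover]:
        slots[t] += 1
    return slots


calibrator = ContextualCalibrator()
morning_mobile = ("morning", "mobile")  # hypothetical context key

# Toy feedback: podcasts engaged 3 of 3 times, music 1 of 4 times.
for _ in range(3):
    calibrator.update(morning_mobile, "podcast", engaged=True)
calibrator.update(morning_mobile, "music", engaged=True)
for _ in range(3):
    calibrator.update(morning_mobile, "music", engaged=False)

target = calibrator.mean_distribution(morning_mobile)
slots = allocate_slots(target, n_slots=10)
explore = calibrator.sampled_distribution(morning_mobile, random.Random(0))
```

With this toy feedback, podcasts receive the largest share of the ten slots in the morning-mobile context, while a context with no feedback falls back to a uniform mix. A calibration based on global historical averages would apply the same music-heavy mix in every context, which is the contrast the abstract draws.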