Bandit Learning for Diversified Interactive Recommendation
This work addresses user dissatisfaction caused by a lack of diversity in recommendations. The proposed method is new for this specific domain, but the contribution is incremental, since it combines existing techniques.
The paper tackles the problem of low diversity in interactive recommender systems by proposing DC$^2$B, a model that uses a determinantal point process and Thompson sampling to improve diversity while maintaining accuracy, and it reports significant gains in diversity metrics on real datasets.
Interactive recommender systems, which let users interact with the recommender, have attracted increasing research attention. Previous methods mainly focus on optimizing recommendation accuracy. However, they usually ignore the diversity of the recommendation results, which often leads to unsatisfying user experiences. In this paper, we propose a novel diversified recommendation model, named Diversified Contextual Combinatorial Bandit (DC$^2$B), for interactive recommendation with users' implicit feedback. Specifically, DC$^2$B employs a determinantal point process in the recommendation procedure to promote diversity of the recommendation results. To learn the model parameters, we propose a Thompson sampling-type algorithm based on variational Bayesian inference. In addition, we provide a theoretical regret analysis to guarantee the performance of DC$^2$B. Extensive experiments on real datasets demonstrate the effectiveness of the proposed method.
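To make the combination of a determinantal point process and Thompson sampling concrete, the following is a minimal sketch, not the DC$^2$B algorithm itself. It makes several assumptions that are not taken from the abstract: a Gaussian posterior over a linear relevance parameter stands in for the variational Bayesian inference, item quality is the exponential of a linear score, the kernel uses cosine similarity, and the item set is chosen by greedy log-determinant maximization. All function names are hypothetical.

```python
import numpy as np


def dpp_kernel(quality, features):
    """Build L = diag(q) S diag(q), where S is the cosine-similarity matrix."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = unit @ unit.T
    return quality[:, None] * similarity * quality[None, :]


def greedy_dpp_select(kernel, k):
    """Greedily add the item that maximizes log det of the selected submatrix."""
    selected = []
    for _ in range(k):
        best_item, best_score = None, -np.inf
        for i in range(kernel.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best_item, best_score = i, score
        if best_item is None:  # no remaining item keeps the determinant positive
            break
        selected.append(best_item)
    return selected


def recommend(features, mean, cov, k, rng):
    """One Thompson-sampling round: sample a parameter, then pick a diverse set."""
    theta = rng.multivariate_normal(mean, cov)  # assumed Gaussian posterior
    quality = np.exp(features @ theta)          # assumed quality function
    return greedy_dpp_select(dpp_kernel(quality, features), k)


# Items 0 and 1 are near duplicates; item 2 is dissimilar but lower quality.
features = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]])
quality = np.array([1.0, 0.9, 0.8])
print(greedy_dpp_select(dpp_kernel(quality, features), 2))  # → [0, 2]
```

In the toy example, the second-best item by quality is skipped because it nearly duplicates the first pick, which is the diversity-promoting behaviour the determinant encodes. A full interactive loop would also update the posterior from the user's implicit feedback after each round, which this sketch omits.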