IR · Jul 4, 2020

Neural Interactive Collaborative Filtering

Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, Dawei Yin

arXiv:2007.02095v118.7127 citations · Has Code

Originality: Highly original

AI Analysis

This paper addresses the challenge of making effective recommendations in interactive settings where user profiles are not yet well established, and it represents an incremental improvement over existing approaches.

The paper tackles interactive collaborative filtering for cold-start and taste-drifting users. It proposes a neural network-based exploration policy, trained with Q-learning, that balances learning the user profile against making accurate recommendations, and it reports better performance than state-of-the-art methods on three benchmark datasets.

In this paper, we study collaborative filtering in an interactive setting, in which the recommender agents iterate between making recommendations and updating the user profile based on the interactive feedback. The most challenging problem in this scenario is how to suggest items when the user profile has not been well established, i.e., how to recommend for cold-start users or for warm-start users whose tastes drift. Existing approaches either rely on an overly pessimistic linear exploration strategy or adopt meta-learning based algorithms in a purely exploitative way. In this work, to quickly catch up with the user's interests, we propose to represent the exploration policy with a neural network and learn it directly from the feedback data. Specifically, the exploration policy is encoded in the weights of multi-channel stacked self-attention neural networks and trained with efficient Q-learning by maximizing users' overall satisfaction in the recommender system. The key insight is that the satisfying recommendations triggered by an exploratory recommendation can be viewed as the exploration bonus (delayed reward) for its contribution to improving the quality of the user profile. Therefore, the proposed exploration policy, which balances learning the user profile against making accurate recommendations, can be directly optimized by maximizing users' long-term satisfaction with reinforcement learning. Extensive experiments and analysis conducted on three benchmark collaborative filtering datasets demonstrate the advantage of our method over state-of-the-art methods.
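The delayed-reward idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's model: it swaps the multi-channel self-attention Q-network for a plain tabular Q-learning update, and the states (`cold_start`, `profile_known`, `end`), the items (`probe`, `safe`, `matched`), and all reward values are hypothetical names and numbers chosen only for illustration. It shows how an exploratory recommendation with no immediate reward can still earn a high Q-value from the satisfying recommendation it makes possible.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    An empty next_actions list marks a terminal next state (bootstrap value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-step session with a cold-start user (all values are made up).
q = defaultdict(float)
for _ in range(200):
    # Once the profile is known, a well-matched item satisfies the user.
    q_update(q, "profile_known", "matched", 1.0, "end", [])
    # The exploratory "probe" item earns nothing now but reveals the profile.
    q_update(q, "cold_start", "probe", 0.0, "profile_known", ["matched"])
    # A generic "safe" item earns a small reward and teaches nothing.
    q_update(q, "cold_start", "safe", 0.3, "end", [])

q_probe = q[("cold_start", "probe")]
q_safe = q[("cold_start", "safe")]
print(f"Q(cold_start, probe) = {q_probe:.3f}")
print(f"Q(cold_start, safe)  = {q_safe:.3f}")
```

Under these assumed rewards, the probe's value is driven entirely by the discounted reward of the follow-up recommendation, so a greedy policy over the learned Q-values prefers exploring first. In the paper's setting the table would be replaced by a neural Q-function over the user's interaction history.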
