Representation Learning in Low-rank Slate-based Recommender Systems
This work addresses sample-efficiency issues in RL for recommender systems. The contribution is incremental: it builds on the existing slate-based recommendation setup and adds a focus on low-rank structure.
The paper tackles the challenge of large state and action spaces in reinforcement learning for recommender systems. It proposes a sample-efficient representation learning algorithm based on low-rank Markov decision processes and constructs a simulation environment to test it.
Reinforcement learning (RL) in recommendation systems offers the potential to optimize recommendations for long-term user engagement. However, the environment often involves large state and action spaces, which makes it hard to learn and explore efficiently. In this work, we treat the standard slate recommendation setup as an online RL problem with low-rank Markov decision processes (MDPs) and propose a sample-efficient representation learning algorithm for it. We also construct a recommender simulation environment with the proposed setup and sampling method.
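To make the low-rank assumption concrete, here is a minimal sketch of a tabular low-rank MDP. It is not the paper's algorithm or its simulator. In a low-rank MDP the transition kernel factorizes as P(s' | s, a) = <phi(s, a), mu(s')> for d-dimensional features phi and mu. The sketch assumes the simpler latent-variable special case, where phi(s, a) is a distribution over d latent factors and each factor has its own next-state distribution. All names (build_low_rank_mdp, transition_prob, sample_next_state, parameter_counts) are hypothetical, and actions are plain integer indices standing in for slates.

```python
import random


def random_distribution(n, rng):
    """Return n positive weights that sum to 1."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def build_low_rank_mdp(num_states, num_actions, rank, seed=0):
    """Build random factors phi and mu (illustrative assumption, not learned)."""
    rng = random.Random(seed)
    # phi[(s, a)]: distribution over `rank` latent factors.
    phi = {
        (s, a): random_distribution(rank, rng)
        for s in range(num_states)
        for a in range(num_actions)
    }
    # mu[k]: next-state distribution of latent factor k.
    mu = [random_distribution(num_states, rng) for _ in range(rank)]
    return phi, mu


def transition_prob(phi, mu, s, a, s_next):
    """P(s_next | s, a) as the inner product <phi(s, a), mu(s_next)>."""
    return sum(w * mu_k[s_next] for w, mu_k in zip(phi[(s, a)], mu))


def sample_next_state(phi, mu, s, a, rng):
    """Sample a latent factor first, then a next state from that factor."""
    k = rng.choices(range(len(mu)), weights=phi[(s, a)])[0]
    return rng.choices(range(len(mu[k])), weights=mu[k])[0]


def parameter_counts(num_states, num_actions, rank):
    """Entries in a full transition table vs. in the two low-rank factors."""
    full = num_states * num_actions * num_states
    low_rank = num_states * num_actions * rank + rank * num_states
    return full, low_rank
```

With 50 states, 10 actions, and rank 3, parameter_counts returns 25000 entries for the full table against 1650 for the factors. That gap is the reason a low-rank structure can make learning and exploration more sample-efficient when state and action spaces are large.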