Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
This work addresses how inter-player information sharing can improve collective performance in multi-task RL, but the contribution appears incremental: it builds on existing model transfer ideas and is restricted to the tabular episodic setting.
The paper studies multi-task reinforcement learning across heterogeneous Markov decision processes. It designs an algorithm based on model transfer and provides gap-dependent and gap-independent upper and lower bounds that characterize the problem's complexity.
We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical MDPs, with the goal of improving their collective performance through inter-player information sharing. We design and analyze an algorithm based on the idea of model transfer, and provide gap-dependent and gap-independent upper and lower bounds that characterize the intrinsic complexity of the problem.
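To make the idea of model transfer in a tabular setting concrete, the sketch below shows one simple form it could take: each player keeps transition counts for its own MDP, and a pooled estimate is formed by summing counts across players. This is an illustrative assumption, not the algorithm analyzed in the paper. The function names (`empirical_model`, `pool_counts`), the uniform fallback for unvisited state-action pairs, and the toy counts are all hypothetical.

```python
def empirical_model(counts):
    """Turn transition counts counts[s][a][s_next] into probabilities.

    Unvisited (s, a) pairs fall back to a uniform distribution
    (an assumption made for this sketch).
    """
    model = []
    for per_state in counts:
        rows = []
        for row in per_state:
            total = sum(row)
            if total > 0:
                rows.append([c / total for c in row])
            else:
                rows.append([1.0 / len(row)] * len(row))
        model.append(rows)
    return model


def pool_counts(all_counts):
    """Sum per-player counts all_counts[player][s][a][s_next] elementwise."""
    first = all_counts[0]
    return [
        [
            [sum(player[s][a][t] for player in all_counts)
             for t in range(len(first[s][a]))]
            for a in range(len(first[s]))
        ]
        for s in range(len(first))
    ]


# Two hypothetical players, 2 states, 1 action (toy numbers, not from the paper).
player_0 = [[[3, 1]], [[0, 0]]]
player_1 = [[[1, 3]], [[0, 2]]]

own = empirical_model(player_0)                              # player 0's data only
pooled = empirical_model(pool_counts([player_0, player_1]))  # shared data

print(own)
print(pooled)
```

In this toy case, pooling gives player 0 an estimate for a state it has never visited, which is the potential benefit of sharing. When the players' MDPs differ, the pooled estimate is biased for each individual player, so a method for the heterogeneous setting would need to account for that dissimilarity rather than rely on pooled counts alone.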