Efficient Model-Free Reinforcement Learning Using Gaussian Process
This work addresses efficiency challenges in reinforcement learning for continuous domains, offering a method that integrates demonstrations to enhance exploration, though it appears incremental as it builds on existing posterior sampling and Gaussian process techniques.
The paper tackles the problem of efficient reinforcement learning in continuous state spaces by proposing the Gaussian Process Posterior Sampling Reinforcement Learning (GPPSTD) algorithm, which combines demonstrations and exploration to reduce uncertainty and improve performance, with theoretical justifications and empirical results provided.
Efficient reinforcement learning usually relies on demonstrations or a good exploration strategy. By applying posterior sampling to model-free RL under a Gaussian process (GP) hypothesis, we propose the Gaussian Process Posterior Sampling Reinforcement Learning (GPPSTD) algorithm for continuous state spaces, and we give theoretical justification and empirical results. We also show, theoretically and empirically, that various demonstrations can lower the expected uncertainty and benefit posterior sampling exploration. In this way, we combine demonstration and exploration to achieve more efficient reinforcement learning.
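The two ideas in the abstract, sampling a value function from a GP posterior and using demonstrations to shrink posterior uncertainty, can be illustrated with a small sketch. This is not the GPPSTD algorithm itself: it assumes a one-dimensional state space, an RBF kernel, and toy value targets regressed directly rather than through temporal-difference updates. All names (`rbf_kernel`, `gp_posterior`, `sample_value_function`, `x_agent`, `x_demo`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of states."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    k_qq = rbf_kernel(x_query, x_query)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    cov = k_qq - k_qt @ np.linalg.solve(k_tt, k_qt.T)
    return mean, 0.5 * (cov + cov.T)  # symmetrize against round-off

def sample_value_function(mean, cov, rng):
    """Draw one plausible value function from the posterior (posterior sampling)."""
    jitter = 1e-8 * np.eye(len(mean))  # keeps the covariance numerically PSD
    return rng.multivariate_normal(mean, cov + jitter)

rng = np.random.default_rng(0)
states = np.linspace(0.0, 1.0, 50)

# Assumption: toy value targets; a real agent would obtain these from returns or TD.
x_agent = np.array([0.10, 0.15, 0.20])   # states visited by the agent
x_demo = np.array([0.50, 0.70, 0.90])    # states covered by demonstrations
y_agent = np.sin(2 * np.pi * x_agent)
y_demo = np.sin(2 * np.pi * x_demo)

# Posterior uncertainty from the agent's own data only.
_, cov_agent = gp_posterior(x_agent, y_agent, states)
var_agent = np.diag(cov_agent)

# Posterior uncertainty after adding demonstration data.
x_both = np.concatenate([x_agent, x_demo])
y_both = np.concatenate([y_agent, y_demo])
mean_both, cov_both = gp_posterior(x_both, y_both, states)
var_both = np.diag(cov_both)

# Posterior sampling: act greedily with respect to one sampled value function.
sampled_values = sample_value_function(mean_both, cov_both, rng)
greedy_state = states[np.argmax(sampled_values)]

print("mean variance, agent only:", var_agent.mean())
print("mean variance, with demos:", var_both.mean())
```

In this sketch the GP posterior variance depends only on where data was observed, not on the observed values, so adding demonstration states can only reduce (or leave unchanged) the variance at every query state. That is the sense in which demonstrations covering unvisited regions lower the uncertainty that posterior sampling draws from.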