Cover Tree Bayesian Reinforcement Learning
This work addresses reinforcement learning in unknown environments with continuous state spaces, but it appears incremental because it builds on existing tree-based and Bayesian methods.
The paper tackles reinforcement learning in continuous state spaces by proposing an online tree-based Bayesian approach. It obtains effective exploration policies through Thompson sampling and approximate dynamic programming, and it demonstrates the method's suitability in an experimental comparison with least squares policy iteration.
This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high-dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least squares policy iteration.
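To make the "updated in closed form" and Thompson sampling ideas concrete, the following is a minimal sketch under strong simplifying assumptions, not the model from the paper. It assumes a scalar state, a single linear-Gaussian model (one node rather than a tree), a Gaussian prior on the slope, and a known noise precision. The class name `ScalarLinearGaussianPosterior` and the helper `run_demo` are hypothetical names introduced only for illustration.

```python
import math
import random


class ScalarLinearGaussianPosterior:
    """Conjugate posterior over the slope `a` in s_next = a * s + noise.

    Assumptions (not from the paper): scalar state, known noise precision,
    Gaussian prior on the slope, and a single node instead of a tree.
    """

    def __init__(self, prior_mean=0.0, prior_precision=1.0, noise_precision=1.0):
        self.mean = prior_mean
        self.precision = prior_precision
        self.noise_precision = noise_precision

    def update(self, s, s_next):
        """Closed-form Bayesian update from one observed transition (s, s_next)."""
        new_precision = self.precision + self.noise_precision * s * s
        self.mean = (
            self.precision * self.mean + self.noise_precision * s * s_next
        ) / new_precision
        self.precision = new_precision

    def sample(self, rng):
        """Draw one plausible slope from the posterior (the Thompson sampling draw)."""
        return rng.gauss(self.mean, 1.0 / math.sqrt(self.precision))


def run_demo(true_slope=0.8, n_steps=200, seed=0):
    """Simulate transitions from a known slope and update the posterior online."""
    rng = random.Random(seed)
    posterior = ScalarLinearGaussianPosterior()
    s = 1.0
    for _ in range(n_steps):
        s_next = true_slope * s + rng.gauss(0.0, 1.0)  # unit-variance noise
        posterior.update(s, s_next)
        s = s_next
    return posterior


if __name__ == "__main__":
    posterior = run_demo()
    sampled_slope = posterior.sample(random.Random(1))
    print("posterior mean:", posterior.mean)
    print("sampled slope:", sampled_slope)
```

In a Thompson sampling loop, the agent would draw one model with `sample`, plan against that sampled model (for example with approximate dynamic programming), act, and then call `update` on the observed transition. The planning step is omitted here; the sketch only shows the conjugate update and the posterior draw.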