LG ML Nov 17, 2018

Recursive Sparse Pseudo-input Gaussian Process SARSA

arXiv:1811.07201v1 0.8

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for reinforcement learning practitioners seeking more efficient Gaussian Process methods.

The paper tackles the memory inefficiency of Sparse Pseudo-input Gaussian Process SARSA (SPGP-SARSA) by deriving recursive formulas for its predictive moments, which allow previous computations to be reused and value estimates to be updated on multiple timescales.

The class of Gaussian Process (GP) methods for Temporal Difference learning has shown promise for data-efficient model-free Reinforcement Learning. In this paper, we consider a recent variant of the GP-SARSA algorithm, called Sparse Pseudo-input Gaussian Process SARSA (SPGP-SARSA), and derive recursive formulas for its predictive moments. This extension promotes greater memory efficiency, since previous computations can be reused, and, interestingly, it provides a technique for updating value estimates on multiple timescales.
