Rates of Convergence in Certain Native Spaces of Approximations used in Reinforcement Learning
This work provides incremental theoretical improvements for researchers in reinforcement learning by refining classical convergence results in specific approximation spaces.
The paper tackles the problem of deriving convergence rates for value function approximations in reinforcement learning within reproducing kernel Hilbert spaces, resulting in explicit geometric upper bounds on error for both value functions and controllers.
This paper studies convergence rates for some value function approximations that arise in a collection of reproducing kernel Hilbert spaces (RKHS) $H(Ω)$. By casting an optimal control problem in a specific class of native spaces, strong rates of convergence are derived for the operator equation that enables the offline approximations that appear in policy iteration. Explicit upper bounds on the error in value function and controller approximations are derived in terms of the power function $\mathcal{P}_{H,N}$ for the space of finite-dimensional approximants $H_N$ in the native space $H(Ω)$. These bounds are geometric in nature and refine some well-known, now classical results concerning convergence of approximations of value functions.
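To make the role of the power function concrete, the sketch below evaluates the standard native-space quantity $\mathcal{P}_{H,N}(x)^2 = K(x,x) - k_N(x)^\top K_N^{-1} k_N(x)$ and checks the classical pointwise interpolation bound $|f(x) - \Pi_N f(x)| \le \mathcal{P}_{H,N}(x)\,\|f\|_H$. It is only an illustration of that textbook bound, not the paper's policy-iteration estimates. The Gaussian kernel, the domain $Ω = [0,1]$, the uniformly spaced centers, the length scale, the small diagonal jitter, and all function names are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(x, y, length_scale=0.2):
    """Matrix K[i, j] = exp(-(x_i - y_j)^2 / (2 * length_scale^2)). Assumed kernel."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(1, -1)
    return np.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))

def gram(centers, jitter=1e-10):
    """Gram matrix of the centers, with a small jitter for numerical stability."""
    return gaussian_kernel(centers, centers) + jitter * np.eye(len(centers))

def power_function(x_eval, centers):
    """P(x) = sqrt(K(x, x) - k(x)^T K_N^{-1} k(x)); K(x, x) = 1 for this kernel."""
    k = gaussian_kernel(centers, x_eval)           # shape (N, M)
    sol = np.linalg.solve(gram(centers), k)        # K_N^{-1} k(x) for every x
    p2 = 1.0 - np.sum(k * sol, axis=0)
    return np.sqrt(np.clip(p2, 0.0, None))         # clip tiny negative round-off

def interpolate(f, x_eval, centers):
    """Kernel interpolant of f on the centers, evaluated at x_eval."""
    coeffs = np.linalg.solve(gram(centers), f(centers))
    return gaussian_kernel(x_eval, centers) @ coeffs

x_eval = np.linspace(0.0, 1.0, 201)

# Sup-norm of the power function over [0, 1] for nested sets of centers.
sup_by_n = {}
for n in (3, 5, 9):
    centers = np.linspace(0.0, 1.0, n)
    sup_by_n[n] = float(power_function(x_eval, centers).max())

# Test function f = K(., z), whose native-space norm is sqrt(K(z, z)) = 1.
z = 0.37
f = lambda x: gaussian_kernel(x, [z]).ravel()

centers = np.linspace(0.0, 1.0, 9)
err = np.abs(f(x_eval) - interpolate(f, x_eval, centers))
bound = power_function(x_eval, centers) * 1.0      # P(x) * ||f||_H
bound_holds = bool(np.all(err <= bound + 1e-8))    # tolerance for round-off

print(sup_by_n, bound_holds)
```

In this setup the supremum of the power function shrinks as centers are added, and the interpolation error stays below the power-function bound at every evaluation point. How fast the supremum decays depends on the kernel and on how the centers fill $Ω$, which is the kind of dependence the paper's rates make explicit.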