Gaussian-Mixture-Model Q-Functions for Reinforcement Learning by Riemannian Optimization
This work targets the policy-evaluation step of reinforcement learning and offers a method built on Riemannian optimization, though the contribution may be incremental, since it applies existing tools to a new context.
The paper introduces Gaussian-mixture models as functional approximators of Q-function losses in reinforcement learning, called GMM-QFs, and reports that this approach, which uses no experienced data, outperforms state-of-the-art methods on benchmark tasks, including deep Q-networks that do use experienced data.
This paper establishes a novel role for Gaussian-mixture models (GMMs) as functional approximators of Q-function losses in reinforcement learning (RL). Unlike the existing RL literature, where GMMs play their typical role as estimates of probability density functions, here GMMs approximate Q-function losses. The new Q-function approximators, coined GMM-QFs, are incorporated in Bellman residuals to formulate a Riemannian-optimization task as a novel policy-evaluation step in standard policy-iteration schemes. The paper demonstrates how the hyperparameters (means and covariance matrices) of the Gaussian kernels are learned from the data, thus opening the door of RL to the powerful toolbox of Riemannian optimization. Numerical tests show that, with no use of experienced data, the proposed design outperforms state-of-the-art methods on benchmark RL tasks, including deep Q-networks, which do use experienced data.
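As a rough illustration of the ideas in the abstract, the Python sketch below evaluates a GMM-style Q-function, forms an empirical Bellman residual, and moves a covariance matrix along the manifold of symmetric positive-definite matrices. It is a sketch under stated assumptions, not the paper's implementation: the unnormalized Gaussian kernel, the concatenated state-action vector z, the mean-squared residual, the affine-invariant exponential map, and the names gmm_qf, bellman_residual, and spd_exp_map are all hypothetical choices made for illustration.

```python
import numpy as np


def gmm_qf(z, weights, means, covs):
    """Q(z) = sum_k w_k * exp(-0.5 (z - m_k)^T C_k^{-1} (z - m_k)).

    Assumed form: real-valued weights and unnormalized Gaussian kernels,
    so Q is a function approximator rather than a probability density.
    """
    q = 0.0
    for w, m, C in zip(weights, means, covs):
        d = z - m
        q += w * np.exp(-0.5 * d @ np.linalg.solve(C, d))
    return q


def bellman_residual(transitions, weights, means, covs, policy, gamma=0.9):
    """Mean squared Bellman residual over (z, reward, next_state) samples.

    z is an assumed state-action vector; policy maps a state to an action.
    """
    total = 0.0
    for z, r, s_next in transitions:
        z_next = np.append(s_next, policy(s_next))
        target = r + gamma * gmm_qf(z_next, weights, means, covs)
        total += (gmm_qf(z, weights, means, covs) - target) ** 2
    return total / len(transitions)


def spd_exp_map(C, V):
    """Exponential map at SPD matrix C along symmetric direction V
    (affine-invariant metric): C^{1/2} expm(C^{-1/2} V C^{-1/2}) C^{1/2}.

    The result is positive definite for any symmetric V, unlike C + V.
    """
    w, U = np.linalg.eigh(C)
    C_half = (U * np.sqrt(w)) @ U.T
    C_inv_half = (U / np.sqrt(w)) @ U.T
    A = C_inv_half @ V @ C_inv_half
    A = 0.5 * (A + A.T)  # symmetrize against round-off
    a, W = np.linalg.eigh(A)
    return C_half @ ((W * np.exp(a)) @ W.T) @ C_half


# Hypothetical usage: one kernel, 1-D state plus 1-D action.
weights, means, covs = [2.0], [np.zeros(2)], [np.eye(2)]
samples = [(np.array([0.0, 0.0]), 1.0, np.array([1.0]))]
loss = bellman_residual(samples, weights, means, covs, policy=lambda s: 0.0)
covs[0] = spd_exp_map(covs[0], -2.0 * np.eye(2))  # stays positive definite
```

The design point this sketch is meant to convey is that a plain Euclidean update C + V can leave the set of valid covariance matrices, whereas a manifold-aware step keeps every iterate positive definite by construction. How the paper computes the search direction V is not shown here.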