LG ML May 19, 2020

Riemannian Proximal Policy Optimization

Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi

arXiv:2005.09195v15.04 citations

Originality: Incremental advance

AI Analysis

This work addresses policy optimization in reinforcement learning. It offers an approach with theoretical convergence guarantees, though the contribution appears incremental because it builds on existing proximal and Riemannian optimization methods.

The paper tackles the problem of solving Markov decision processes by proposing a Riemannian proximal optimization algorithm. Policy functions are modeled with Gaussian mixture models, and the resulting nonconvex problem is posed in the Riemannian space of positive semidefinite matrices. The authors state that the algorithm has guaranteed convergence and report its efficacy in preliminary experiments.

In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems. To model policy functions in an MDP, we employ a Gaussian mixture model (GMM) and formulate policy learning as a nonconvex optimization problem in the Riemannian space of positive semidefinite matrices. For two given policy functions, we also provide a lower bound on policy improvement by using bounds derived from the Wasserstein distance of GMMs. Preliminary experiments show the efficacy of our proposed Riemannian proximal policy optimization algorithm.
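
The abstract refers to bounds derived from the Wasserstein distance of GMMs. As background only, the sketch below computes the squared 2-Wasserstein distance between two single Gaussian components, under the simplifying assumption that both covariances are diagonal. It is not the paper's GMM bound or its algorithm; the function name `gaussian_w2_squared_diag` and its argument layout are hypothetical, chosen for illustration.

```python
import math


def gaussian_w2_squared_diag(mean1, var1, mean2, var2):
    """Squared 2-Wasserstein distance between two Gaussians.

    Assumption (not from the paper): both covariances are diagonal, so each
    Gaussian is given by per-dimension means and per-dimension variances.
    """
    if not (len(mean1) == len(var1) == len(mean2) == len(var2)):
        raise ValueError("all inputs must have the same length")
    if any(v < 0 for v in list(var1) + list(var2)):
        raise ValueError("variances must be non-negative")

    # Squared Euclidean distance between the two mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mean1, mean2))

    # For diagonal covariances the covariance term reduces to the squared
    # difference of the per-dimension standard deviations.
    cov_term = sum(
        (math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2)
    )
    return mean_term + cov_term


# Two unit-variance components whose means differ by (3, 4).
print(gaussian_w2_squared_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))
```

With full (non-diagonal) covariance matrices the covariance term involves a matrix square root, which this sketch deliberately leaves out.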
