LG · ML · May 19, 2020

Riemannian Proximal Policy Optimization

arXiv:2005.09195v1 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses policy optimization in reinforcement learning, offering an approach with theoretical convergence guarantees; it appears incremental, as it builds on existing proximal and Riemannian methods.

The paper tackles the problem of solving Markov decision processes by proposing a Riemannian proximal optimization algorithm that models policy functions with Gaussian mixture models in the Riemannian space of positive semidefinite matrices, with guaranteed convergence and preliminary experiments demonstrating its efficacy.

In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems. To model policy functions in an MDP, we employ a Gaussian mixture model (GMM) and formulate policy optimization as a nonconvex optimization problem in the Riemannian space of positive semidefinite matrices. For two given policy functions, we also provide a lower bound on policy improvement by using bounds derived from the Wasserstein distance of GMMs. Preliminary experiments show the efficacy of our proposed Riemannian proximal policy optimization algorithm.
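
The abstract refers to bounds derived from the Wasserstein distance of GMMs but does not state them, so the paper's bound is not reproduced here. As a rough illustration of the kind of quantity involved, the sketch below computes the standard closed-form squared 2-Wasserstein distance between two single Gaussian components, ||m_a - m_b||^2 + tr(S_a + S_b - 2 (S_b^{1/2} S_a S_b^{1/2})^{1/2}). The function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical and chosen for this example only; how such component-wise distances enter the authors' GMM bound is an assumption not confirmed by the text above.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a positive semidefinite matrix (hypothetical helper)."""
    mat = 0.5 * (mat + mat.T)  # symmetrize to remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative eigenvalues
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gaussian_w2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    cov_a = np.asarray(cov_a, dtype=float)
    cov_b = np.asarray(cov_b, dtype=float)

    root_b = psd_sqrt(cov_b)
    cross = psd_sqrt(root_b @ cov_a @ root_b)  # (S_b^{1/2} S_a S_b^{1/2})^{1/2}

    mean_term = np.sum((mean_a - mean_b) ** 2)
    cov_term = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(mean_term + cov_term)


if __name__ == "__main__":
    identity = np.eye(2)
    # Equal covariances: the distance reduces to the squared mean difference.
    print(gaussian_w2_squared([0.0, 0.0], identity, [3.0, 4.0], identity))
```

When the two covariances are equal, the covariance term vanishes and only the squared Euclidean distance between the means remains, which gives a quick sanity check for the helper.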

