LG SY ML

May 14, 2019

Control Regularization for Reduced Variance Reinforcement Learning

Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick

arXiv:1905.05380v1

20.887 citations

Has Code

Originality: Incremental advance

AI Analysis

This paper addresses unreliable performance in reinforcement learning for continuous control tasks and offers a method to improve stability and efficiency. The advance is incremental, since it builds on existing regularization techniques.

The paper tackles high variance in model-free reinforcement learning for continuous control. It proposes functional regularization, which keeps the learned policy close to a policy prior, and reports significantly reduced variance, guaranteed dynamic stability, and more efficient learning.

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
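As an illustration only, the idea of regularizing a learned policy toward a policy prior can be sketched as a weighted blend of the two actions. The abstract does not give the exact form, so the blend below, the weight `lam`, and the helper names `mixed_action` and `variance_scale` are assumptions made for this sketch. It shows the trade-off in its simplest form: with a deterministic prior, a larger `lam` shrinks the variance of the applied action while pulling its mean toward the prior, which introduces bias whenever the prior is not optimal.

```python
import numpy as np


def mixed_action(u_learned, u_prior, lam):
    """Blend a learned action with a prior action (hypothetical form).

    lam = 0 returns the learned action unchanged; as lam grows,
    the result approaches the prior action.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    u_learned = np.asarray(u_learned, dtype=float)
    u_prior = np.asarray(u_prior, dtype=float)
    return (u_learned + lam * u_prior) / (1.0 + lam)


def variance_scale(lam):
    """Factor by which the blend scales the learned action's variance,
    assuming the prior action is deterministic."""
    return 1.0 / (1.0 + lam) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for noisy learned actions across runs (synthetic data).
    learned = rng.normal(loc=1.0, scale=2.0, size=10_000)
    prior = 0.0  # stand-in for a fixed prior controller output

    for lam in (0.0, 1.0, 4.0):
        blended = mixed_action(learned, prior, lam)
        print(f"lam={lam}: mean={blended.mean():.3f}, var={blended.var():.3f}")
```

Choosing `lam` is where the trade-off shows up: small values leave the learned policy nearly untouched, large values hand control almost entirely to the prior. The adaptive tuning strategy mentioned in the abstract is not reproduced here.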
