LG | SY | ML | May 14, 2019

Control Regularization for Reduced Variance Reinforcement Learning

arXiv:1905.05380v1 | 87 citations
Originality: Incremental advance
AI Analysis

This paper addresses unreliable run-to-run performance in reinforcement learning for continuous control tasks, offering a method that improves stability and efficiency. The advance is incremental because it builds on existing regularization techniques.

The paper tackles high variance in model-free reinforcement learning for continuous control. It proposes a functional regularization approach that keeps the learned policy close to a policy prior, and reports significantly reduced variance, guaranteed dynamic stability, and more efficient learning.

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
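The bias-variance trade-off described in the abstract can be illustrated with a toy sketch. It assumes one simple way to regularize in function space: the applied action is a convex combination of the learned action and the prior's action, with a weight `lam` on the prior. That mixing form, the names `mix_controls`, `prior_controller`, and `noisy_learned_policy`, and all numeric values are assumptions made for illustration and are not stated in the text above. The "learned policy" here is just a noisy linear map standing in for a deep policy whose output varies from seed to seed.

```python
import random
import statistics


def mix_controls(u_learned, u_prior, lam):
    """Blend a learned action with a prior action.

    lam = 0 returns the learned action unchanged; as lam grows,
    the result approaches the prior action. (Assumed mixing form.)
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return (u_learned + lam * u_prior) / (1.0 + lam)


def prior_controller(x, gain=1.0):
    """Hypothetical hand-designed prior: simple proportional feedback."""
    return -gain * x


def noisy_learned_policy(x, rng, noise_std=1.0):
    """Stand-in for a deep policy: a different feedback law plus seed noise."""
    return -1.5 * x + rng.gauss(0.0, noise_std)


def action_stats(lam, x=1.0, n_seeds=2000):
    """Mean and variance of the mixed action at state x across simulated seeds."""
    rng = random.Random(0)  # fixed seed so every lam sees the same noise draws
    actions = [
        mix_controls(noisy_learned_policy(x, rng), prior_controller(x), lam)
        for _ in range(n_seeds)
    ]
    return statistics.fmean(actions), statistics.pvariance(actions)


if __name__ == "__main__":
    for lam in (0.0, 1.0, 4.0):
        mean, var = action_stats(lam)
        print(f"lam={lam}: mean action={mean:.3f}, variance={var:.4f}")
```

In this sketch the variance of the mixed action shrinks by a factor of 1/(1 + lam)^2, while its mean moves away from the learned policy's mean and toward the prior's action. That shift is the bias side of the trade-off, and it is why a tuning strategy for the regularization weight matters.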

Code Implementations: 1 repo
