OC · LG · Jan 9, 2020

Regularity and stability of feedback relaxed controls

arXiv:2001.03148v2 · 26 citations
AI Analysis

This work provides theoretical justification for reinforcement learning heuristics that use exploration rewards to enhance robustness in decision-making, addressing stability issues in control problems.

The paper tackles the design of robust feedback controls for multi-dimensional continuous-time stochastic exit time problems by introducing a relaxed control regularization with exploration rewards, establishing Hölder continuous feedback controls and Lipschitz stability of value functions and controls under parameter perturbations.

This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control performs robustly in a perturbed system, and derive a first-order sensitivity equation for both the value function and the optimal feedback relaxed control. These stability results provide a theoretical justification for the recent reinforcement learning heuristic that including an exploration reward in the optimization objective leads to more robust decision making. Finally, we prove first-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters, which subsequently enables us to construct the pure exploitation strategy of the original control problem based on the feedback relaxed controls.
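As a rough illustration of the last point, the sketch below shows how a relaxed control with an exploration reward behaves as the exploration parameter vanishes. It is not the paper's construction: it assumes a single state, a finite action set, and Shannon entropy as the exploration reward, and the names `relaxed_step`, `scores` and `eps` are hypothetical. Under those assumptions the regularized maximization has the closed form `eps * log(sum(exp(h_i / eps)))`, attained by a softmax distribution over actions.

```python
import math


def relaxed_step(h, eps):
    """Entropy-regularized maximization over a finite action set.

    h   : per-action scores (hypothetical stand-in for Hamiltonian values)
    eps : exploration parameter, must be > 0

    Returns (value, policy) where
      value  = eps * log(sum_i exp(h_i / eps))
      policy = softmax(h / eps), the maximizing relaxed control
    under the assumption that the exploration reward is Shannon entropy.
    """
    m = max(h)  # shift by the max for numerical stability
    weights = [math.exp((x - m) / eps) for x in h]
    total = sum(weights)
    value = m + eps * math.log(total)
    policy = [w / total for w in weights]
    return value, policy


scores = [1.0, 0.5, -0.2]
for eps in (1.0, 0.1, 0.01):
    value, policy = relaxed_step(scores, eps)
    print(eps, round(value, 4), [round(p, 3) for p in policy])
```

In this toy setting the regularized value decreases monotonically towards `max(scores)` as `eps` shrinks, and the softmax policy concentrates on the maximizing action, which is the discrete analogue of recovering a pure exploitation strategy from relaxed controls.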
