Stabilizing Dynamical Systems via Policy Gradient Methods
This work addresses a fundamental limitation in control systems engineering: it enables stabilization of an unknown system without a pre-given stabilizing controller. The contribution is incremental in that it builds on existing policy gradient methods.
The paper tackles the problem of stabilizing unknown, fully observed control systems without a pre-existing stabilizing controller. It introduces a model-free algorithm that solves a series of discounted LQR problems with gradually increasing discount factors. It proves that the method efficiently recovers a stabilizing controller for linear systems, and for smooth nonlinear systems within a neighborhood of their equilibria, and it demonstrates effectiveness on common control benchmarks.
Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering. In this paper, we provide a simple, model-free algorithm for stabilizing fully observed dynamical systems. While model-free methods have become increasingly popular in practice due to their simplicity and flexibility, stabilization via direct policy search has received surprisingly little attention. Our algorithm proceeds by solving a series of discounted LQR problems, where the discount factor is gradually increased. We prove that this method efficiently recovers a stabilizing controller for linear systems, and for smooth, nonlinear systems within a neighborhood of their equilibria. Our approach overcomes a significant limitation of prior work, namely the need for a pre-given stabilizing control policy. We empirically evaluate the effectiveness of our approach on common control benchmarks.
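The idea of gradually increasing the discount factor can be illustrated with a minimal sketch for a known linear system x_{t+1} = A x_t + B u_t with cost sum_t gamma^t (x_t' Q x_t + u_t' R u_t). It rests on the standard fact that a discounted LQR problem is an undiscounted one on the scaled pair (sqrt(gamma) A, sqrt(gamma) B), so the zero controller has finite cost once gamma is small enough. This sketch is not the paper's algorithm: it swaps the model-free policy gradient step for an exact Riccati iteration, uses the true A and B to check spectral radii, and the midpoint rule for raising gamma is an assumption chosen for simplicity. The matrices and all function names are hypothetical.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def solve_discounted_lqr(A, B, Q, R, gamma, iters=10_000, tol=1e-10):
    """Gain K (for u = -K x) of the discounted LQR problem.

    Stand-in for the policy gradient step: Riccati value iteration on
    the scaled system (sqrt(gamma) A, sqrt(gamma) B).
    """
    Ag, Bg = np.sqrt(gamma) * A, np.sqrt(gamma) * B
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + Bg.T @ P @ Bg, Bg.T @ P @ Ag)
        P_next = Q + Ag.T @ P @ (Ag - Bg @ G)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    return np.linalg.solve(R + Bg.T @ P @ Bg, Bg.T @ P @ Ag)

def discount_annealing(A, B, Q, R, max_rounds=500):
    """Start from K = 0 and raise gamma until A - B K is stable."""
    K = np.zeros((B.shape[1], A.shape[0]))
    gamma, gammas = 0.0, []
    for _ in range(max_rounds):
        rho = spectral_radius(A - B @ K)
        if rho < 1.0:
            return K, gammas  # K stabilizes the undiscounted system
        # K has finite discounted cost only for gamma < 1 / rho**2.
        # Assumed rule: move halfway from the current gamma to that bound.
        gamma = 0.5 * (gamma + 1.0 / rho**2)
        gammas.append(gamma)
        K = solve_discounted_lqr(A, B, Q, R, gamma)
    raise RuntimeError("no stabilizing gain found within max_rounds")

# Hypothetical open-loop unstable, controllable example.
A = np.array([[1.2, 0.5], [0.0, 1.1]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K, gammas = discount_annealing(A, B, Q, R)
print("discount schedule:", np.round(gammas, 3))
print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Each round only needs a controller with finite cost at the current discount factor, which the previous round supplies. This is why no stabilizing controller for the original, undiscounted problem is required at the start.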