Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior
This work addresses closed-loop stability in reinforcement learning control, which matters for applications such as robotics and autonomous systems. The contribution appears incremental, however, since it integrates an existing controller parameterization with RL.
The paper tackles feedback controller design by combining deep reinforcement learning with the stability guarantees of the Youla-Kucera parameterization. The resulting framework optimizes over all stable behaviors, builds its internal model from input-output data, and extends to nonlinear operators.
We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by using the Youla-Kucera parameterization to define the search domain. Recent advances in behavioral systems allow us to construct a data-driven internal model; this enables an alternative realization of the Youla-Kucera parameterization based entirely on input-output exploration data. Perhaps of independent interest, we formulate and analyze the stability of such data-driven models in the presence of noise. The Youla-Kucera approach requires a stable "parameter" for controller design. For the training of reinforcement learning agents, the set of all stable linear operators is given explicitly through a matrix factorization approach. Moreover, a nonlinear extension is given using a neural network to express a parameterized set of stable operators, which enables seamless integration with standard deep learning libraries. Finally, we show how these ideas can also be applied to tune fixed-structure controllers.
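The abstract does not spell out the matrix factorization, so the following is only a minimal sketch of one way to map unconstrained parameters to a stable discrete-time linear operator. It is not necessarily the construction used in the paper. It builds A = S^{-1} U B S with S symmetric positive definite, U orthogonal, and B symmetric positive semidefinite with spectral norm below one. Because A is similar to U B, its spectral radius is bounded by the norm of B, and so stays strictly inside the unit circle. The names `stable_matrix`, `W_s`, `W_u`, `W_b`, and `margin` are hypothetical and chosen for illustration.

```python
import numpy as np


def stable_matrix(W_s, W_u, W_b, margin=1e-2):
    """Map three unconstrained n-by-n arrays to a Schur-stable matrix.

    Illustrative assumption: A = S^{-1} U B S, so A is similar to U B and
    rho(A) = rho(U B) <= ||U B||_2 = ||B||_2 < 1.
    """
    n = W_s.shape[0]
    # S: symmetric positive definite (the margin keeps it invertible).
    S = W_s @ W_s.T + margin * np.eye(n)
    # U: orthogonal factor from a QR decomposition of an arbitrary matrix.
    U, _ = np.linalg.qr(W_u)
    # B: symmetric positive semidefinite, rescaled so its spectral norm is < 1.
    G = W_b @ W_b.T
    B = G / (np.linalg.norm(G, 2) + margin)
    # Solve S A = U B S instead of forming the inverse of S explicitly.
    return np.linalg.solve(S, U @ B @ S)


def spectral_radius(A):
    """Largest eigenvalue magnitude of A."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


rng = np.random.default_rng(0)
n = 4
A = stable_matrix(rng.standard_normal((n, n)),
                  rng.standard_normal((n, n)),
                  rng.standard_normal((n, n)))
print("spectral radius:", spectral_radius(A))
```

Since every choice of the three raw arrays yields a stable matrix, a reinforcement learning agent could update them with ordinary unconstrained gradient steps. The sketch checks only that each output is stable. It does not show that every stable matrix can be reached this way.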