SY LG

Aug 31, 2022

A stabilizing reinforcement learning approach for sampled systems with partially unknown models

Lukas Beckenbach, Pavel Osinenko, Stefan Streif

arXiv:2208.14714

Originality: Incremental advance

AI Analysis

This work addresses the lack of stability guarantees in online reinforcement learning for control systems. It proposes a hybrid approach that combines reinforcement learning with classical adaptive control techniques. The advance is incremental: it builds on existing methods to improve reliability.

The authors address closed-loop stability in online reinforcement learning for sampled systems with partially unknown models. Their method guarantees practical stability and, in tests on adaptive traction control and cruise control, significantly reduced the cost.

Reinforcement learning is commonly associated with the training of reward-maximizing (or cost-minimizing) agents, in other words, controllers. It can be applied in a model-free or model-based fashion, using system data collected a priori or online to train the parametric architectures involved. In general, online reinforcement learning does not guarantee closed-loop stability unless special measures are taken, for instance, through learning constraints or tailored training rules. Particularly promising are hybrids of reinforcement learning with "classical" control approaches. In this work, we suggest a method to guarantee practical stability of the system-controller closed loop in a purely online learning setting, i.e., without offline training. Moreover, we assume only partial knowledge of the system model. To achieve the claimed results, we employ techniques of classical adaptive control. The implementation of the overall control scheme is provided explicitly in a digital, sampled setting. That is, the controller receives the state of the system and computes the control action at discrete, specifically equidistant, moments in time. The method is tested in adaptive traction control and cruise control, where it significantly reduced the cost.
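The sampled setting described in the abstract can be illustrated with a minimal sample-and-hold simulation. This is only a sketch under stated assumptions, not the authors' method: the scalar plant dx/dt = a*x + u, the fixed feedback gain k, the sampling period delta, and the function name simulate_sampled_loop are all hypothetical, and a plain proportional controller stands in for the reinforcement learning agent. The point is only to show the control action being computed at equidistant sampling instants and held constant in between.

```python
def simulate_sampled_loop(x0=1.0, a=0.5, k=2.0, delta=0.1, t_end=5.0, substeps=100):
    """Sample-and-hold simulation of the hypothetical plant dx/dt = a*x + u.

    The controller sees the state only at equidistant instants t_j = j*delta
    and applies u = -k * x(t_j), held constant until the next instant.
    Returns the list of states at the sampling instants.
    """
    n_samples = round(t_end / delta)   # number of sampling intervals
    h = delta / substeps               # inner Euler step for the continuous plant
    x = x0
    states = [x]
    for _ in range(n_samples):
        u = -k * x                     # control computed once per sampling instant
        for _ in range(substeps):
            x += h * (a * x + u)       # u stays fixed between samples (zero-order hold)
        states.append(x)
    return states


if __name__ == "__main__":
    trajectory = simulate_sampled_loop()
    print(f"initial state: {trajectory[0]:.4f}, final state: {trajectory[-1]:.6f}")
```

With these illustrative values the open-loop plant is unstable (a > 0), yet the held feedback contracts the state at every sampling instant. A larger delta or a smaller k can break this, which is why stability in the sampled setting has to be argued for the chosen sampling period rather than assumed from the continuous-time design.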
