SYLG
Jun 16, 2022

Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning

arXiv:2206.07915v2
4 citations
h-index: 53
Originality: Incremental advance
AI Analysis

This work addresses safety-critical engineering implementations by providing a method to completely guarantee safety in reinforcement learning, which is an incremental improvement over existing approaches.

The paper tackles the problem of ensuring complete safety in reinforcement learning control. It integrates control barrier functions with sum-of-squares programming to guarantee that control actions remain within the safe region. The method is demonstrated on an inverted pendulum model, where it is reported to outperform quadratic-programming-based methods.
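For orientation, the two ingredients named above can be written in their standard textbook form. The notation below (f, g, h, α, σ, z, Q) is generic and assumed here; it is not taken from the paper, whose exact program may differ. The last line is one common way to turn a barrier condition into a sum-of-squares constraint when f, g, h and the controller u(x) are polynomial.

```latex
% Control-affine dynamics and safe set (generic notation, assumed here)
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}

% Control barrier function condition with a class-K function \alpha
\sup_{u}\; \bigl[ L_f h(x) + L_g h(x)\,u + \alpha(h(x)) \bigr] \ge 0 \quad \forall x \in \mathcal{C}

% A polynomial p is a sum of squares (SOS) iff it has a PSD Gram matrix
% over a vector of monomials z(x)
p(x) = \sum_i q_i(x)^2 \;\Longleftrightarrow\; p(x) = z(x)^\top Q\, z(x), \quad Q \succeq 0

% Sufficient SOS certificate for a polynomial controller u(x) and linear \alpha(h) = \alpha h
L_f h(x) + L_g h(x)\,u(x) + \alpha\, h(x) - \sigma(x)\,h(x) \in \Sigma[x], \qquad \sigma \in \Sigma[x]
```

Here Σ[x] denotes the set of SOS polynomials. If the last expression is SOS and σ is SOS, then on the safe set (h ≥ 0) the barrier condition holds, because the left-hand side is at least σ(x) h(x) ≥ 0. Searching for Q ⪰ 0 is a semidefinite program, which is what makes the certificate computable.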

Safety guarantees are essential in many engineering implementations. Reinforcement learning provides a useful way to strengthen safety. However, reinforcement learning algorithms cannot completely guarantee safety in realistic operation. To address this issue, this work applies control barrier functions on top of reinforcement learning and proposes a compensated algorithm that completely maintains safety. Specifically, sum-of-squares programming is exploited to search for the optimal controller and tune the learning hyperparameters simultaneously. Thus, the control actions are guaranteed to always remain within the safe region. The effectiveness of the proposed method is demonstrated on an inverted pendulum model. Compared with quadratic-programming-based reinforcement learning methods, our sum-of-squares-programming-based reinforcement learning has shown its superiority.
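To make the idea of compensating a learned action with a control barrier function concrete, here is a minimal sketch of the quadratic-programming-style filter that the abstract uses as its point of comparison. It is not the paper's sum-of-squares method. The pendulum parameters, the elliptical safe set, the gain `ALPHA`, and the function names are all hypothetical choices made for illustration. With a single scalar input and a single constraint, the QP has a closed-form solution, so no solver is needed.

```python
import numpy as np

# Hypothetical pendulum parameters (not taken from the paper).
G, L, M = 9.81, 1.0, 1.0
A, B = 1.0, 4.0   # semi-axes of an assumed safe ellipse in (theta, omega)
ALPHA = 5.0       # assumed gain of the linear class-K function


def barrier(x):
    """h(x) >= 0 exactly when the state lies inside the assumed safe ellipse."""
    theta, omega = x
    return 1.0 - (theta / A) ** 2 - (omega / B) ** 2


def lie_derivatives(x):
    """Return (Lf h, Lg h) for theta' = omega, omega' = (G/L) sin(theta) + u / (M L^2)."""
    theta, omega = x
    grad_h = np.array([-2.0 * theta / A**2, -2.0 * omega / B**2])
    f = np.array([omega, (G / L) * np.sin(theta)])
    g = np.array([0.0, 1.0 / (M * L**2)])
    return float(grad_h @ f), float(grad_h @ g)


def cbf_filter(x, u_rl):
    """Return the action closest to u_rl that satisfies Lf h + Lg h * u + ALPHA * h >= 0."""
    lf_h, lg_h = lie_derivatives(x)
    slack = lf_h + ALPHA * barrier(x) + lg_h * u_rl
    if slack >= 0.0:
        return u_rl  # the learned action already satisfies the barrier condition
    if abs(lg_h) < 1e-9:
        raise ValueError("barrier condition cannot be enforced at this state")
    # Project u_rl onto the boundary of the half-line of admissible inputs.
    return u_rl - slack / lg_h


if __name__ == "__main__":
    state = np.array([0.5, 3.0])   # near the edge of the assumed safe ellipse
    print(cbf_filter(state, 10.0)) # compensated action replacing the unsafe one
```

At the upright rest state the filter leaves the learned action untouched, while near the boundary of the ellipse it overrides a destabilizing action with the nearest admissible one. A pointwise filter like this enforces the condition only at the states it is evaluated on, whereas a sum-of-squares certificate aims to verify the condition over the whole safe region at once.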

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
