SY AI, Mar 26, 2024

Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems

Ehsan Sabouni, H. M. Sabbir Ahmad, Vittorio Giammarino, Christos G. Cassandras, Ioannis Ch. Paschalidis, Wenchao Li

arXiv:2403.17338v37.315 citations, h-index: 41, Has Code, CDC

Originality: Incremental advance

AI Analysis

This work addresses parameter tuning for safety-critical control systems, particularly in automated vehicles, but is incremental as it combines existing RL and MPC-CBF methods.

The paper addresses parameter tuning in Control Barrier Function (CBF)-based controllers for safety-critical systems, where the choice of parameters affects both performance and feasibility. It proposes a Reinforcement Learning-based Receding Horizon Control approach that learns the optimal parameters through bilevel optimization. In automated merging control for Connected and Automated Vehicles, the results show improved performance and a significant reduction in infeasible cases compared with traditional heuristic tuning methods.

Optimal control methods provide solutions to safety-critical problems but easily become intractable. Control Barrier Functions (CBFs) have emerged as a popular technique that facilitates their solution by provably guaranteeing safety, through their forward invariance property, at the expense of some performance loss. This approach involves defining a performance objective alongside CBF-based safety constraints that must always be enforced. Unfortunately, both performance and solution feasibility can be significantly impacted by two key factors: (i) the selection of the cost function and associated parameters, and (ii) the calibration of parameters within the CBF-based constraints, which capture the trade-off between performance and conservativeness. To address these challenges, we propose a Reinforcement Learning (RL)-based Receding Horizon Control (RHC) approach leveraging Model Predictive Control (MPC) with CBFs (MPC-CBF). In particular, we parameterize our controller and use bilevel optimization, where RL is used to learn the optimal parameters while MPC computes the optimal control input. We validate our method by applying it to the challenging automated merging control problem for Connected and Automated Vehicles (CAVs) at conflicting roadways. Results demonstrate improved performance and a significant reduction in the number of infeasible cases compared to traditional heuristic approaches used for tuning CBF-based controllers, showcasing the effectiveness of the proposed method.
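The bilevel structure described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: a one-step CBF-constrained QP with a closed-form solution stands in for the MPC-CBF inner problem, and a plain grid search stands in for the RL outer loop. The car-following model, the safety function h = z - PHI * v, the headway `PHI`, the acceleration limits, the reference controller gain, and the cost weights are all assumptions chosen for illustration.

```python
"""Toy bilevel tuning of a CBF parameter (illustrative assumptions throughout)."""

PHI = 1.8                  # assumed time headway [s] in h = z - PHI * v
U_MIN, U_MAX = -3.0, 2.0   # assumed acceleration limits [m/s^2]


def cbf_filter(z, v, v_lead, u_ref, alpha):
    """Inner problem: min (u - u_ref)^2 s.t. CBF constraint and input bounds.

    State: gap z to the lead vehicle, ego speed v. With h = z - PHI * v and
    dh/dt = v_lead - v - PHI * u, the CBF condition dh/dt + alpha * h >= 0
    becomes an upper bound on u. Returns (u, feasible).
    """
    h = z - PHI * v
    u_cbf = (v_lead - v + alpha * h) / PHI
    upper = min(U_MAX, u_cbf)
    if upper < U_MIN:
        # CBF constraint conflicts with the braking limit: the QP is infeasible.
        return U_MIN, False  # assumed fallback: brake as hard as allowed
    return min(max(u_ref, U_MIN), upper), True


def rollout(alpha, steps=200, dt=0.05):
    """Simulate one episode; return (cost, number of infeasible steps, min h)."""
    z, v, v_lead, v_des = 60.0, 25.0, 15.0, 25.0  # hypothetical scenario
    cost, infeasible, min_h = 0.0, 0, float("inf")
    for _ in range(steps):
        u_ref = 1.0 * (v_des - v)  # simple speed-tracking reference input
        u, ok = cbf_filter(z, v, v_lead, u_ref, alpha)
        infeasible += 0 if ok else 1
        cost += (u * u + 0.1 * (v - v_des) ** 2) * dt
        z += (v_lead - v) * dt     # forward-Euler update of the gap
        v += u * dt                # forward-Euler update of the speed
        min_h = min(min_h, z - PHI * v)
    return cost, infeasible, min_h


def tune(alphas):
    """Outer problem: pick alpha with fewest infeasible steps, then lowest cost.

    Grid search is a stand-in here; the paper uses RL for this level.
    """
    results = {a: rollout(a) for a in alphas}
    best = min(alphas, key=lambda a: (results[a][1], results[a][0]))
    return best, results


if __name__ == "__main__":
    best, results = tune([0.1, 0.25, 0.5, 1.0, 2.0])
    for a, (cost, infeasible, min_h) in results.items():
        print(f"alpha={a}: cost={cost:.2f}, infeasible={infeasible}, min_h={min_h:.2f}")
    print("selected alpha:", best)
```

Running the script prints the cost, the infeasible-step count, and the smallest value of h for each candidate `alpha`, which makes the dependence of feasibility and performance on the CBF parameter visible in this toy setting. Because h is linear in the state here, any rollout that stays feasible keeps h nonnegative under the Euler update as long as `alpha * dt <= 1`.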
