3.9 | LG | Apr 26
CAPSULE: Control-Theoretic Action Perturbations for Safe Uncertainty-Aware Reinforcement Learning
Rahul Narava, Siddharth Verma, Ojas Jain et al.
Ensuring safe exploration in high-dimensional systems with unknown dynamics remains a significant challenge. Existing safe reinforcement learning methods often provide safety guarantees only in expectation, which can still lead to safety violations. Control-theoretic approaches, in contrast, offer hard constraint-based safety guarantees but typically assume access to known system dynamics or require accurate estimation of control-affine models. In this paper, we propose a safe reinforcement learning framework that learns a probabilistic control-affine dynamics model in an offline setting. The learned model is leveraged to explicitly construct control barrier functions (CBFs) that incorporate model uncertainty to provide conservative safety constraints. These CBF constraints are enforced through an online constraint-based action correction mechanism, enabling safe exploration without overly restricting task performance. Empirical evaluations on nonlinear, complex continuous-control benchmarks demonstrate that our approach achieves returns comparable to those of existing baselines while significantly reducing safety violations.
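As a rough illustration of the action-correction idea in the abstract above, the sketch below applies a single uncertainty-tightened CBF constraint to a control-affine model x' = f(x) + G(x)u and projects the policy's action onto the safe half-space in closed form. This is not the paper's implementation: the function name `cbf_correct_action`, the scalar uncertainty term `sigma`, and the parameters `alpha` and `kappa` are all assumptions made for illustration.

```python
from typing import List, Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cbf_correct_action(
    u_rl: Sequence[float],             # action proposed by the RL policy (length m)
    grad_h: Sequence[float],           # gradient of the barrier h at the state (length n)
    f_mean: Sequence[float],           # learned drift term f(x), mean prediction (length n)
    g_mean: Sequence[Sequence[float]], # learned control matrix G(x), mean prediction (n x m)
    h_val: float,                      # barrier value h(x); h >= 0 means "safe"
    sigma: float,                      # hypothetical scalar bound on model uncertainty
    alpha: float = 1.0,                # assumed class-K gain (linear here)
    kappa: float = 2.0,                # assumed conservativeness multiplier on sigma
) -> List[float]:
    """Return the action closest to u_rl (Euclidean norm) that satisfies
    grad_h . (f + G u) + alpha * h - kappa * sigma >= 0.
    """
    n, m = len(grad_h), len(u_rl)
    # Rewrite the constraint as a . u + b >= 0.
    a = [sum(grad_h[i] * g_mean[i][j] for i in range(n)) for j in range(m)]
    b = _dot(grad_h, f_mean) + alpha * h_val - kappa * sigma

    slack = _dot(a, u_rl) + b
    if slack >= 0.0:
        return list(u_rl)  # proposed action already satisfies the constraint

    norm_sq = _dot(a, a)
    if norm_sq == 0.0:
        raise ValueError("control has no influence on the barrier at this state")

    # Minimal-norm shift along a that lands exactly on the constraint boundary.
    scale = -slack / norm_sq
    return [u + scale * a_j for u, a_j in zip(u_rl, a)]


# Toy 1-D example: x' = u, safe set x <= 1, so h(x) = 1 - x and grad_h = -1.
u_safe = cbf_correct_action(
    u_rl=[0.5], grad_h=[-1.0], f_mean=[0.0], g_mean=[[1.0]],
    h_val=0.1, sigma=0.02,
)
print(u_safe)
```

In the toy example the state is close to the boundary (h = 0.1), so the proposed action 0.5 is scaled back to roughly 0.06. With several simultaneous constraints the closed-form projection no longer applies and a quadratic-program solver would be needed instead.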
SY | Mar 11, 2025
Balancing SoC in Battery Cells using Safe Action Perturbations
E Harshith Kumar Yadav, Rahul Narava, Anshika et al.
Maintaining equal charge levels through active cell balancing while charging a Li-ion battery is challenging. An imbalance in charge levels degrades the battery's state of health and raises the risk of thermal runaway and fire. Traditional methods treat safety assurance as a trade-off between safety and charging time. Others rely on battery-specific conditions to ensure safety, so their control strategies do not generalize across battery configurations. In this work, we propose a method to learn safe battery charging actions by adding a safety layer on top of a deep reinforcement learning (RL) agent. The safety layer perturbs the agent's action to prevent the battery from entering unsafe or dangerous states. Further, our deep RL framework focuses on learning a generalized policy that can be applied across varying battery configurations. Our experimental results demonstrate that safety-layer-based action perturbation incurs fewer safety violations by avoiding unsafe states, while learning a policy that is robust across several battery configurations.
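To make the safety-layer idea concrete, here is a minimal sketch of an action perturbation for per-cell charging currents. It is not the authors' method: it assumes a simple coulomb-counting model in which a cell's state of charge rises by `current * dt_h / capacity_ah` per step, and the function name `perturb_charging_currents` and the limits `soc_max` and `i_min` are hypothetical.

```python
from typing import List, Sequence


def perturb_charging_currents(
    currents: Sequence[float],       # per-cell currents proposed by the agent (A)
    socs: Sequence[float],           # per-cell state of charge, in [0, 1]
    capacities_ah: Sequence[float],  # per-cell capacity (Ah)
    dt_h: float,                     # control step length (hours)
    soc_max: float = 0.95,           # assumed upper SoC safety limit
    i_min: float = 0.0,              # assumed lower bound on charging current (A)
) -> List[float]:
    """Clamp each requested current so the predicted next SoC stays <= soc_max.

    Predicted next SoC (assumed model): soc + current * dt_h / capacity_ah.
    Works for any number of cells, since each cell is handled independently.
    """
    safe = []
    for i_req, soc, cap in zip(currents, socs, capacities_ah):
        # Largest current that keeps this cell at or below soc_max after one step.
        i_limit = max(i_min, (soc_max - soc) * cap / dt_h)
        # Leave the action untouched when it is already inside [i_min, i_limit].
        safe.append(min(max(i_req, i_min), i_limit))
    return safe


# Three 2.5 Ah cells at different charge levels, all asked to charge at 2 A.
safe_currents = perturb_charging_currents(
    currents=[2.0, 2.0, 2.0],
    socs=[0.50, 0.90, 0.949],
    capacities_ah=[2.5, 2.5, 2.5],
    dt_h=0.1,
)
print(safe_currents)
```

In this example the first cell keeps its requested 2 A, while the two cells near the limit are reduced to roughly 1.25 A and 0.025 A. Because the layer only changes actions that would violate the limit, the agent's behaviour is unaffected in states far from the boundary.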