RO, LG · Mar 2, 2022

Model-free Neural Lyapunov Control for Safe Robot Navigation

arXiv:2203.01190v1 · 17 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses safety assurance for DRL-based robot navigation. It is an incremental improvement over existing methods: it strengthens safety guarantees while retaining the scalability of DRL.

The paper tackles safety in model-free deep reinforcement learning (DRL) controllers for robot navigation. It explicitly co-learns a Twin Neural Lyapunov Function (TNLF) with the control policy and uses the learned TNLF as a runtime monitor that, combined with a planner's path, chooses waypoints guiding the controller along collision-free trajectories. On a range of high-dimensional, safety-sensitive navigation tasks, the authors report that the approach is effective compared with DRL with augmented rewards and with constrained DRL methods.

Model-free Deep Reinforcement Learning (DRL) controllers have demonstrated promising results on various challenging non-linear control tasks. While a model-free DRL algorithm can solve unknown dynamics and high-dimensional problems, it lacks safety assurance. Although safety constraints can be encoded as part of a reward function, there still exists a large gap between an RL controller trained with this modified reward and a safe controller. In contrast, instead of implicitly encoding safety constraints with rewards, we explicitly co-learn a Twin Neural Lyapunov Function (TNLF) with the control policy in the DRL training loop and use the learned TNLF to build a runtime monitor. Combined with the path generated from a planner, the monitor chooses appropriate waypoints that guide the learned controller to provide collision-free control trajectories. Our approach inherits the scalability advantages from DRL while enhancing safety guarantees. Our experimental evaluation demonstrates the effectiveness of our approach compared to DRL with augmented rewards and constrained DRL methods over a range of high-dimensional safety-sensitive navigation tasks.
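
To make the runtime-monitor idea concrete, here is a minimal Python sketch of one way such a monitor could pick waypoints from a planner's path. It is not the authors' implementation. It assumes the learned Lyapunov function can be queried as a scalar `V(state, waypoint)` and that a sublevel set `V <= level` approximates the region from which the learned controller is expected to reach the waypoint. The names `toy_lyapunov`, `select_waypoint`, and `level` are hypothetical, and the squared-distance function only stands in for a trained network.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def toy_lyapunov(state: Point, waypoint: Point) -> float:
    """Hypothetical stand-in for a learned Lyapunov network.

    Uses the squared Euclidean distance to the waypoint, which is zero at
    the waypoint and positive elsewhere. A real TNLF would be a trained model.
    """
    dx = state[0] - waypoint[0]
    dy = state[1] - waypoint[1]
    return dx * dx + dy * dy


def select_waypoint(
    state: Point,
    path: Sequence[Point],
    lyapunov: Callable[[Point, Point], float],
    level: float,
) -> int:
    """Return the index of the waypoint the controller should track next.

    Assumption (not from the paper): a waypoint is considered reachable when
    the Lyapunov value of the current state with respect to that waypoint is
    at most `level`. The monitor picks the farthest such waypoint along the
    path, and falls back to the lowest-valued waypoint if none qualifies.
    """
    if not path:
        raise ValueError("path must contain at least one waypoint")

    chosen = None
    for index, waypoint in enumerate(path):
        if lyapunov(state, waypoint) <= level:
            chosen = index  # keep the latest (farthest along the path) match

    if chosen is None:
        # No waypoint is inside the assumed safe level set: be conservative.
        chosen = min(range(len(path)), key=lambda i: lyapunov(state, path[i]))
    return chosen


if __name__ == "__main__":
    planned_path = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    index = select_waypoint((0.0, 0.0), planned_path, toy_lyapunov, level=5.0)
    print("next waypoint:", planned_path[index])
```

In this sketch a smaller `level` makes the monitor more conservative, since it only accepts waypoints close to the current state under the assumed Lyapunov measure.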

Code Implementations: 1 repo
