RO | Apr 17

Autonomous Vehicle Collision Avoidance With Racing Parameterized Deep Reinforcement Learning

Shathushan Sivashangaran, Vihaan Dutta, Apoorva Khairnar, Sepideh Gohari, Azim Eskandarian

arXiv:2604.16702 | 22.9 | h-index: 33

AI Analysis

For autonomous vehicle safety systems, this work offers a computationally efficient DRL-based collision avoidance method that operates at the limits of vehicle dynamics, potentially reducing accidents caused by human error.

The paper presents a parameterized Deep Reinforcement Learning (DRL) collision avoidance policy for autonomous vehicles, inspired by race car overtaking, that outperforms a Model Predictive Control and Artificial Potential Function (MPC-APF) baseline across three intersection collision scenarios, with 31x fewer FLOPS and 64x lower inference latency. The reversed heading variant achieves 30% better performance in head-to-head collisions than the default policy and 50% better than the baseline.

Road traffic accidents are a leading cause of fatalities worldwide. In the US, human error causes 94% of crashes, resulting in more than 7,000 pedestrian fatalities and $500 billion in costs annually. Emergency collision avoidance systems for Autonomous Vehicles (AVs) must operate at the limits of vehicle dynamics at a high frequency, a dual constraint of nonlinear kinodynamic accuracy and computational efficiency. Such systems further enhance safety during adverse weather and cybersecurity breaches, and help evade dangerous human driving when AVs and human drivers share roads. This paper parameterizes a Deep Reinforcement Learning (DRL) collision avoidance policy Out-Of-Distribution (OOD) using race car overtaking, without explicit geometric mimicry reference trajectory guidance, in simulation, with a physics-informed, simulator exploit-aware reward that encodes nonlinear vehicle kinodynamics. Two policies are evaluated: a default unidirectional policy and a reversed heading variant that navigates in the opposite direction to other cars. Both consistently outperform a Model Predictive Control and Artificial Potential Function (MPC-APF) baseline, with zero-shot transfer to proportionally scaled hardware, across three intersection collision scenarios, at 31x fewer Floating Point Operations (FLOPS) and 64x lower inference latency. The reversed heading policy outperforms the default racing overtaking policy in head-to-head collisions by 30% and the baseline by 50%, and matches the former in side collisions, where both DRL policies achieve 10% greater evasion than numerical optimal control.
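For readers unfamiliar with the Artificial Potential Function half of the MPC-APF baseline, the following is a minimal sketch of the generic textbook formulation: a quadratic attractive potential toward a goal plus a repulsive potential that is active only within an influence radius of each obstacle. It is not the paper's baseline. The function name `apf_force`, the gains `k_att` and `k_rep`, and the radius `d0` are all illustrative assumptions, and the abstract does not state which potential forms or parameters the authors used.

```python
import math

def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Net 2D force from a textbook artificial potential field.

    All names and default values here are hypothetical, chosen for
    illustration; they are not taken from the paper.
    """
    # Attractive term: negative gradient of 0.5 * k_att * ||pos - goal||^2.
    fx = -k_att * (pos[0] - goal[0])
    fy = -k_att * (pos[1] - goal[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Repulsion acts only inside the influence radius d0.
        if 0.0 < d <= d0:
            # Negative gradient of 0.5 * k_rep * (1/d - 1/d0)^2,
            # directed away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

# Example: goal straight ahead, one obstacle 1 m in front of the vehicle.
force = apf_force((0.0, 0.0), (10.0, 0.0), [(1.0, 0.0)])
```

In an MPC-APF scheme, a potential of this kind is typically folded into the optimizer's cost rather than applied directly as a force, which is one reason such baselines tend to cost more per control step than a single forward pass of a learned policy.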
