Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control
This paper addresses the problem of domain transfer for autonomous vehicles and offers a more interpretable and robust approach than transferring learned neural network features. The contribution appears incremental, however, since it builds on existing robust control and RL methods.
The paper tackles the modeling gap between training and deployment domains in deep reinforcement learning for autonomous driving. It proposes a robust-control-based transfer architecture (RL-RC) that transfers kinematic quantities and uses a disturbance observer for tracking control, and it reports zero-shot transfer in simulation across scenarios such as lane keeping and obstacle avoidance.
Although deep reinforcement learning (deep RL) methods have many strengths that make them attractive for autonomous driving, real-world deep RL applications in this area have been held back by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to using uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing, and obstacle avoidance.
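To illustrate the idea behind DOB-based tracking, the following is a minimal sketch and not the paper's controller. It assumes a hypothetical scalar integrator plant in place of the vehicle dynamics, a constant unknown disturbance standing in for a side force, and a first-order low-pass filter as the observer's filter. The names `simulate`, `kp`, `alpha`, and `disturbance` are illustrative assumptions.

```python
def simulate(use_dob, steps=400, dt=0.01, kp=2.0, alpha=0.2,
             disturbance=1.5, ref=1.0):
    """Track `ref` on a toy plant x' = u + d (assumed, not the paper's model).

    The nominal model used by the controller is x' = u, so `disturbance`
    plays the role of the modeling gap.
    Returns the final state and the final disturbance estimate.
    """
    x = 0.0       # plant state (e.g. a lateral offset in this toy setting)
    d_hat = 0.0   # disturbance estimate maintained by the observer
    for _ in range(steps):
        # Nominal tracking law designed for the disturbance-free model.
        u_nom = kp * (ref - x)
        # With the DOB enabled, cancel the estimated disturbance.
        u = u_nom - d_hat if use_dob else u_nom

        # True plant step: includes the disturbance the controller never sees.
        x_next = x + dt * (u + disturbance)

        # DOB step: invert the nominal model to get a raw disturbance
        # estimate, then low-pass filter it.
        d_raw = (x_next - x) / dt - u
        d_hat = (1.0 - alpha) * d_hat + alpha * d_raw

        x = x_next
    return x, d_hat


x_plain, _ = simulate(use_dob=False)
x_dob, d_est = simulate(use_dob=True)
print(f"final tracking error without DOB: {abs(x_plain - 1.0):.4f}")
print(f"final tracking error with DOB:    {abs(x_dob - 1.0):.4f}")
print(f"estimated disturbance:            {d_est:.4f}")
```

In this toy setting the proportional law alone settles at an offset of roughly `disturbance / kp`, while the observer-compensated loop converges to the reference using the same nominal controller. This mirrors, in a much simplified form, why a policy trained on nominal dynamics can be reused unchanged when the tracking layer absorbs the mismatch.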