Validity Learning on Failures: Mitigating the Distribution Shift in Autonomous Vehicle Planning
This work addresses a critical bottleneck in autonomous driving planning, distribution shift, with the aim of safer and more reliable vehicle navigation.
The paper tackles the covariate shift problem in Imitation Learning for autonomous vehicle planning by proposing Validity Learning on Failures, VL(on failure), which learns from planner failures without expert annotations. The method yields substantial improvements in closed-loop metrics such as Score, Progress, and Success Rate, and outperforms state-of-the-art methods on the Bench2Drive benchmark.
Planning is a fundamental component of the autonomous driving stack. Recent advances in representation learning have enabled vehicles to understand their surroundings, which in turn makes learning-based planning practical. Among these approaches, Imitation Learning stands out for its training efficiency. However, traditional Imitation Learning suffers from covariate shift. We propose Validity Learning on Failures, VL(on failure), to address this issue. The core of our method is to deploy a pre-trained planner across diverse scenarios. Instances where the planner fails to meet its immediate objectives, such as maintaining a safe distance from obstacles or adhering to traffic rules, are flagged as failures. The states corresponding to these failures are compiled into a new dataset, termed the failure dataset. Because this data has no expert annotations, standard imitation learning approaches cannot be applied to it. To learn from these closed-loop mistakes, we introduce the VL objective, which aims to identify valid trajectories in the current environmental context. Experiments on both reactive CARLA simulation and non-reactive log-replay simulation show substantial improvements in closed-loop metrics such as Score, Progress, and Success Rate, underscoring the effectiveness of the proposed method. Further evaluation on the Bench2Drive benchmark shows that VL(on failure) outperforms state-of-the-art methods by a large margin.
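The failure-collection step described above can be illustrated with a minimal sketch. It assumes a toy one-dimensional lane with a single static obstacle; the names `State`, `is_failure`, `collect_failures`, and `constant_speed_planner`, as well as the 2 m safe-distance threshold, are hypothetical and not taken from the paper. The VL objective itself is not specified here, so the sketch stops at building the failure dataset.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class State:
    """Toy 1-D driving state (hypothetical, for illustration only)."""
    ego_position: float       # ego position along the lane, in metres
    obstacle_position: float  # static obstacle ahead, in metres


def is_failure(state: State, safe_distance: float = 2.0) -> bool:
    """Flag a state where the gap to the obstacle is below the safe distance."""
    return (state.obstacle_position - state.ego_position) < safe_distance


def collect_failures(
    planner: Callable[[State], float],
    initial_states: List[State],
    horizon: int = 10,
    safe_distance: float = 2.0,
) -> List[State]:
    """Roll the planner out in closed loop and keep the states where it fails.

    No expert action is recorded for these states, which mirrors the
    annotation-free failure dataset described in the abstract.
    """
    failures: List[State] = []
    for state in initial_states:
        for _ in range(horizon):
            if is_failure(state, safe_distance):
                failures.append(state)
                break  # end this rollout at the first failure
            step = planner(state)  # planner output: displacement for this step
            state = State(state.ego_position + step, state.obstacle_position)
    return failures


def constant_speed_planner(state: State) -> float:
    """Stand-in for a pre-trained planner: always advances 1 m, never brakes."""
    return 1.0


initial_states = [State(0.0, 5.0), State(0.0, 100.0)]
failure_dataset = collect_failures(constant_speed_planner, initial_states)
print(failure_dataset)
```

In this toy setup only the first scenario produces a failure, because the obstacle in the second scenario is out of reach within the rollout horizon. A realistic pipeline would replace the toy state, planner, and validity check with simulator observations and rule checks such as collision or traffic-rule violations.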