Experience-Based Heuristic Search: Robust Motion Planning with Deep Q-Learning
This work makes Deep Reinforcement Learning-based motion planners more robust for safety-critical systems such as autonomous vehicles, an incremental improvement for the field.
This paper addresses the challenge of robust motion planning for autonomous driving by integrating a Deep Q-Network as a heuristic into a search algorithm. The proposed Experience-Based-Heuristic-Search algorithm overcomes the statistical failure rates of purely Deep Reinforcement Learning-based planners while retaining computational benefits from pre-learned optimal policies. It was benchmarked in valet parking scenarios, demonstrating computational advantages and robustness.
Interaction-aware planning for autonomous driving requires an exploration of a combinatorial solution space when using conventional search- or optimization-based motion planners. With Deep Reinforcement Learning, optimal driving strategies can also be derived for higher-dimensional problems. However, these methods guarantee optimality of the resulting policy only in a statistical sense, which impedes their usage in safety-critical systems such as autonomous vehicles. Thus, we propose the Experience-Based-Heuristic-Search algorithm, which overcomes the statistical failure rate of a Deep Reinforcement Learning-based planner and still benefits computationally from the pre-learned optimal policy. Specifically, we show how experiences in the form of a Deep Q-Network can be integrated as a heuristic into a heuristic search algorithm. We benchmark our algorithm in the field of path planning in semi-structured valet parking scenarios. There, we analyze the accuracy of these heuristic estimates and demonstrate the computational advantages and robustness of our method. Our method may encourage further investigation of the applicability of reinforcement-learning-based planning in the field of self-driving vehicles.
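The idea of using learned Q-values as a search heuristic can be illustrated with a minimal sketch. It is not the paper's implementation: as assumptions for illustration, a tabular Q-function trained on an obstacle-free grid stands in for the Deep Q-Network, and a plain A*-style grid search stands in for the vehicle path planner. All names (`train_q_table`, `q_heuristic`, `heuristic_search`) are hypothetical. With a reward of -1 per step, the negated maximum Q-value of a state estimates its remaining path cost, which the search uses as its heuristic. The search itself checks every successor for collisions, so a returned path is collision-free even when the learned estimate is inaccurate.

```python
import heapq
from typing import Callable, Dict, List, Optional, Set, Tuple

State = Tuple[int, int]
ACTIONS: List[State] = [(1, 0), (-1, 0), (0, 1), (0, -1)]
QTable = Dict[Tuple[State, int], float]


def train_q_table(size: int, goal: State, sweeps: int = 50) -> QTable:
    """Toy stand-in for a trained DQN (assumption): tabular Q-values
    obtained by repeated Bellman backups on an obstacle-free grid,
    with reward -1 per step and no discounting."""
    n_actions = len(ACTIONS)
    q: QTable = {((x, y), a): 0.0
                 for x in range(size) for y in range(size)
                 for a in range(n_actions)}
    for _ in range(sweeps):
        for (s, a) in list(q):
            if s == goal:
                continue  # terminal state keeps value 0
            nx, ny = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
            # Moving off the grid leaves the state unchanged.
            nxt = (nx, ny) if 0 <= nx < size and 0 <= ny < size else s
            q[(s, a)] = -1.0 + max(q[(nxt, b)] for b in range(n_actions))
    return q


def q_heuristic(q: QTable) -> Callable[[State], float]:
    """Cost-to-go estimate h(s) = -max_a Q(s, a)."""
    return lambda s: -max(q[(s, a)] for a in range(len(ACTIONS)))


def heuristic_search(start: State, goal: State, size: int,
                     obstacles: Set[State],
                     h: Callable[[State], float]
                     ) -> Tuple[Optional[List[State]], int]:
    """A*-style search; returns (path or None, number of expansions)."""
    open_list = [(h(start), 0, start)]
    cost: Dict[State, int] = {start: 0}
    parent: Dict[State, Optional[State]] = {start: None}
    expanded = 0
    while open_list:
        _, g, s = heapq.heappop(open_list)
        if g > cost[s]:
            continue  # stale queue entry, a cheaper one was processed
        expanded += 1
        if s == goal:
            path: List[State] = []
            node: Optional[State] = s
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], expanded
        for dx, dy in ACTIONS:
            n = (s[0] + dx, s[1] + dy)
            in_bounds = 0 <= n[0] < size and 0 <= n[1] < size
            # Explicit collision check: validity never depends on h.
            if not in_bounds or n in obstacles:
                continue
            if g + 1 < cost.get(n, float("inf")):
                cost[n] = g + 1
                parent[n] = s
                heapq.heappush(open_list, (g + 1 + h(n), g + 1, n))
    return None, expanded


if __name__ == "__main__":
    size, start, goal = 6, (0, 0), (5, 5)
    wall = {(3, y) for y in range(5)}  # wall with a single gap at (3, 5)
    h = q_heuristic(train_q_table(size, goal))
    path, expanded = heuristic_search(start, goal, size, wall, h)
    print("path:", path)
    print("expanded nodes:", expanded)
```

In this sketch the Q-table was trained without obstacles, so its estimate can be optimistic once obstacles are present. The search then corrects for that at planning time, which mirrors the division of labor described in the abstract: the learned experience guides the expansion order, while the search remains responsible for feasibility.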