RO AI | May 8, 2020

Learning hierarchical behavior and motion planning for autonomous driving

Jingke Wang, Yue Wang, Dongkun Zhang, Yezhou Yang, Rong Xiong

arXiv:2005.03863v1 | 17.942 citations

Originality: Incremental advance

AI Analysis

This work addresses tactical decision-making for autonomous driving. It is an incremental improvement that integrates a classical sampling-based motion planner into a learning-based solution.

The paper tackles tactical decision-making in learning-based autonomous driving by introducing hierarchical behavior and motion planning (HBMP), which models behavior explicitly. Using the optimal cost of a classical sampling-based motion planner as the reward for high-level behavior learning reduces the action space and diversifies the rewards without losing the optimality of HBMP. Experiments demonstrate the method's effectiveness, and the model transfers successfully to a real-world environment, which supports its generalization capability.

Learning-based driving solutions, a new branch of autonomous driving, are expected to simplify the modeling of driving by learning the underlying mechanisms from data. To improve tactical decision-making in learning-based driving solutions, we introduce hierarchical behavior and motion planning (HBMP) to explicitly model behavior in the learning-based solution. Because the action spaces of behavior and motion are coupled, it is challenging to solve the HBMP problem using reinforcement learning (RL) for long-horizon driving tasks. We transform the HBMP problem by integrating a classical sampling-based motion planner, whose optimal cost is used as the reward for high-level behavior learning. As a result, this formulation reduces the action space and diversifies the rewards without losing the optimality of HBMP. In addition, we propose a sharable representation for input sensory data across simulation platforms and the real-world environment, so that models trained in a fast event-based simulator, SUMO, can be used to initialize and accelerate RL training in a dynamics-based simulator, CARLA. Experimental results demonstrate the effectiveness of the method. Moreover, the model is successfully transferred to the real world, validating its generalization capability.
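The central idea of the abstract, scoring each high-level behavior by the optimal cost of a sampling-based motion planner, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the behavior set, the `Obstacle` type, the cost terms and their weights, the grid of sampled end conditions, and the convention that the reward is the negated minimum cost.

```python
import math
from dataclasses import dataclass

# Hypothetical behavior set: target lateral offsets in metres.
BEHAVIORS = {"keep_lane": 0.0, "change_left": 3.5, "change_right": -3.5}


@dataclass
class Obstacle:
    s: float             # longitudinal position (m)
    d: float             # lateral position (m)
    radius: float = 1.5  # collision radius (m)


def lateral_profile(d0, d1, T, t):
    """Smooth lateral blend from d0 to d1 over duration T (quintic smoothstep)."""
    u = min(max(t / T, 0.0), 1.0)
    return d0 + (d1 - d0) * (10 * u**3 - 15 * u**4 + 6 * u**5)


def trajectory_cost(d0, d_target, d_end, T, speed, obstacles, dt=0.1, horizon=6.0):
    """Toy cost: manoeuvre aggressiveness + terminal offset error + collision penalty."""
    cost = ((d_end - d0) / T) ** 2 + 2.0 * (d_end - d_target) ** 2
    for k in range(int(round(horizon / dt)) + 1):
        t = k * dt
        s, d = speed * t, lateral_profile(d0, d_end, T, t)  # constant-speed assumption
        for ob in obstacles:
            if math.hypot(s - ob.s, d - ob.d) < ob.radius:
                return cost + 1000.0  # large penalty for any collision
    return cost


def plan(behavior, d0, speed, obstacles):
    """Low level: sample end offsets and durations, return (min_cost, best_params)."""
    d_target = BEHAVIORS[behavior]
    best = (math.inf, None)
    for d_end in (d_target - 0.5, d_target, d_target + 0.5):
        for T in (2.0, 3.0, 4.0, 5.0):
            c = trajectory_cost(d0, d_target, d_end, T, speed, obstacles)
            if c < best[0]:
                best = (c, (d_end, T))
    return best


def behavior_reward(behavior, d0, speed, obstacles):
    """High level: reward for a behavior is the negated optimal planner cost (assumed sign)."""
    return -plan(behavior, d0, speed, obstacles)[0]


if __name__ == "__main__":
    # A stopped obstacle 30 m ahead in the ego lane; ego drives at 10 m/s.
    scene = [Obstacle(s=30.0, d=0.0)]
    for name in BEHAVIORS:
        print(name, behavior_reward(name, d0=0.0, speed=10.0, obstacles=scene))
```

In this sketch the learner only chooses among a few discrete behaviors, while the planner fills in the continuous motion, which is one way to read the claim that the formulation reduces the action space. Because the reward comes from a graded cost and not from a sparse success signal, different behaviors in the same scene receive different values.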

