RO, LG, SY | Oct 1, 2021

Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning

arXiv:2110.00640v1 | 26 citations
Originality: Highly original
AI Analysis

This paper addresses the problem of overly conservative and computationally expensive planning for autonomous vehicles in uncertain environments, offering a novel method for safety-critical applications.

The paper tackles motion planning for autonomous vehicles under sensing and perception uncertainty by proposing a reinforcement learning approach that optimizes for worst-case outcomes, resulting in improved planning behavior comparable to human driving in simulated scenarios.

Motion planning under uncertainty is one of the main challenges in developing autonomous vehicles. In this work, we focus on the uncertainty in sensing and perception that results from a limited field of view, occlusions, and limited sensing range. This problem is often tackled by considering hypothetical hidden objects in occluded areas or beyond the sensing range to guarantee passive safety. However, this may result in conservative planning and expensive computation, particularly when numerous hypothetical objects need to be considered. We propose a reinforcement learning (RL) based solution that manages uncertainty by optimizing for the worst-case outcome. This approach contrasts with traditional RL, where agents try to maximize the expected reward. The proposed approach is built on top of distributional RL, with its policy optimization maximizing the lower bound of the stochastic outcomes. This modification can be applied to a range of RL algorithms. As a proof of concept, the approach is applied to two different RL algorithms, Soft Actor-Critic and DQN. The approach is evaluated on two challenging scenarios: pedestrians crossing with occlusion, and curved roads with a limited field of view. The algorithm is trained and evaluated using the SUMO traffic simulator. The proposed approach yields much better motion planning behavior than conventional RL algorithms and behaves comparably to a human driving style.
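The contrast between maximizing the expected reward and maximizing a lower bound of the return distribution can be illustrated with a small sketch. This is not the paper's implementation: the quantile-based representation of the return distribution, the function names, the `n_lowest` parameter, and the numbers are all assumptions chosen for illustration.

```python
import numpy as np


def greedy_action_mean(quantiles):
    """Conventional choice: pick the action with the highest mean return.

    quantiles has shape (n_actions, n_quantiles); each row is a set of
    samples (quantile estimates) of that action's return distribution.
    """
    return int(np.argmax(quantiles.mean(axis=1)))


def greedy_action_lower_bound(quantiles, n_lowest=1):
    """Worst-case choice: pick the action whose lowest returns are best.

    The lower bound is approximated here by averaging the n_lowest
    smallest quantile estimates per action (an illustrative assumption).
    """
    sorted_q = np.sort(quantiles, axis=1)
    lower = sorted_q[:, :n_lowest].mean(axis=1)
    return int(np.argmax(lower))


# Hypothetical return distributions for two actions near an occlusion.
quantiles = np.array([
    [-10.0, 4.0, 5.0, 6.0],  # action 0, "keep speed": good on average, rare severe outcome
    [0.5, 1.0, 1.0, 1.5],    # action 1, "slow down": modest but never bad
])

print(greedy_action_mean(quantiles))         # action index chosen by the mean criterion
print(greedy_action_lower_bound(quantiles))  # action index chosen by the lower-bound criterion
```

With these made-up numbers, the mean criterion prefers action 0 (mean 1.25 versus 1.0), while the lower-bound criterion prefers action 1 (worst case 0.5 versus -10). This mirrors the intuition in the abstract: a rare severe outcome dominates the decision even when the average looks favorable.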
