LG, ML, Mar 27, 2018

Safe end-to-end imitation learning for model predictive control

arXiv:1803.10231v3, 29 citations
Originality: Incremental advance
AI Analysis

The work addresses safety in autonomous systems such as self-driving cars, aiming to prevent failures when a learned policy encounters novel situations. The advance is incremental because it builds on existing imitation and reinforcement learning techniques.

The paper tackles the problem of keeping learned control policies safe when test inputs differ from the training data. It uses Bayesian networks to provide uncertainty estimates, and combines reinforcement and imitation learning to learn both a policy and an uncertainty threshold without hand-tuning. The method was validated on cart-pole and autonomous driving simulations, showing robustness to varying dynamics and partial observability.

We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully-observable and vision-based partially-observable systems in cart-pole and autonomous driving simulations, using deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability.
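The handoff rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `predictive_mean_std`, `gated_action`, `toy_policy`, and `toy_expert` are hypothetical, the Bayesian network is replaced by a toy stochastic function whose noise grows outside an assumed training range, and the threshold is a fixed number rather than one learned by reinforcement learning as in the paper.

```python
import random
import statistics
from typing import Callable, Tuple


def predictive_mean_std(
    stochastic_policy: Callable[[float], float],
    observation: float,
    n_samples: int = 20,
) -> Tuple[float, float]:
    """Estimate the predictive mean and standard deviation by repeated sampling."""
    samples = [stochastic_policy(observation) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.pstdev(samples)


def gated_action(
    stochastic_policy: Callable[[float], float],
    expert: Callable[[float], float],
    observation: float,
    threshold: float,
    n_samples: int = 20,
) -> Tuple[float, str]:
    """Use the learned policy unless its uncertainty exceeds the threshold."""
    mean, std = predictive_mean_std(stochastic_policy, observation, n_samples)
    if std > threshold:
        # Corrective action: return control to the expert (e.g. an MPC).
        return expert(observation), "expert"
    return mean, "learned"


_rng = random.Random(0)  # fixed seed so the sketch is reproducible


def toy_policy(observation: float) -> float:
    """Hypothetical stand-in for a Bayesian network: one stochastic forward pass.

    Assumes training data covered observations in [-1, 1]; the noise level
    grows with the distance from that range to mimic higher uncertainty.
    """
    novelty = max(0.0, abs(observation) - 1.0)
    return 0.5 * observation + _rng.gauss(0.0, 0.01 + novelty)


def toy_expert(observation: float) -> float:
    """Hypothetical stand-in for the model predictive controller."""
    return 0.5 * observation


if __name__ == "__main__":
    for obs in (0.2, 5.0):
        action, source = gated_action(toy_policy, toy_expert, obs, threshold=0.1)
        print(f"obs={obs}: action={action:.3f} from {source}")
```

In this sketch, an observation inside the assumed training range is handled by the learned policy, while a far-out observation triggers the fallback. How the threshold is chosen is the part the paper learns rather than hand-tunes, and that step is deliberately left out here.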

