LG ML Mar 27, 2018

Safe end-to-end imitation learning for model predictive control

Keuntaek Lee, Kamil Saigol, Evangelos A. Theodorou

arXiv:1803.10231v3 10.829 citations

Originality: Incremental advance

AI Analysis

This work addresses safety concerns in autonomous systems such as self-driving cars, aiming to prevent failures when the system encounters novel situations. It is an incremental advance in that it builds on existing imitation learning and reinforcement learning techniques.

The paper addresses the safety of learned control policies when test-time inputs differ from the training data. It uses Bayesian neural networks to produce uncertainty estimates, and combines reinforcement learning with imitation learning to learn both a control policy and an uncertainty threshold without hand-tuning. The method was validated in cart-pole and autonomous driving simulations, where it was robust to varying system dynamics and partial observability.

We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully-observable and vision-based partially-observable systems in cart-pole and autonomous driving simulations, using deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability.
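The uncertainty-gated handover described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy linear policy, the use of random dropout masks as a stand-in for Bayesian inference, the fixed threshold (the paper learns it with reinforcement learning), and the names `mc_predict` and `gated_action` are all assumptions made for illustration.

```python
import numpy as np


def mc_predict(weights, x, n_samples=50, drop_prob=0.1, rng=None):
    """Approximate a predictive mean and standard deviation.

    Assumption: a toy linear policy whose weights are randomly dropped on
    each forward pass, standing in for a Bayesian neural network.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # One dropout mask per stochastic forward pass: shape (n_samples, d).
    keep = rng.random((n_samples, weights.size)) >= drop_prob
    masked = weights * keep / (1.0 - drop_prob)
    samples = masked @ x  # one control output per pass
    return samples.mean(), samples.std()


def gated_action(mean, std, expert_action, threshold):
    """Return (action, used_expert).

    The learned policy's mean output is used unless its predictive
    uncertainty exceeds the threshold, in which case control returns to
    the expert (for example a model predictive controller or a human).
    """
    if std > threshold:
        return expert_action, True
    return mean, False


if __name__ == "__main__":
    weights = np.array([0.5, -1.0, 2.0])  # hypothetical policy weights
    state = np.array([0.1, 0.2, 0.3])     # hypothetical observation
    mean, std = mc_predict(weights, state)
    action, used_expert = gated_action(mean, std, expert_action=0.0, threshold=0.05)
    print(action, used_expert)
```

A hard threshold on the predictive standard deviation keeps the handover rule easy to inspect; in the paper the threshold is learned jointly with the policy rather than fixed by hand as it is here.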
