LG OC

Apr 3, 2023

Imitation Learning from Nonlinear MPC via the Exact Q-Loss and its Gauss-Newton Approximation

Andrea Ghezzi, Jasper Hoffman, Jonathan Frey, Joschka Boedecker, Moritz Diehl

arXiv:2304.01782v16.611 citations

h-index: 67

Originality: Incremental advance

AI Analysis

This work addresses constraint satisfaction in Imitation Learning for control systems. Its improvement over existing methods is specific to learning policies from Model Predictive Control experts.

The authors address the problem of learning nonlinear Model Predictive Control policies via Imitation Learning by introducing a Q-function-based loss that embeds the performance objectives and constraint satisfaction of the underlying Optimal Control Problem. In their experiments, this loss significantly reduces constraint violations while achieving comparable or better closed-loop costs than standard Behavioral Cloning.

This work presents a novel loss function for learning nonlinear Model Predictive Control policies via Imitation Learning. Standard approaches to Imitation Learning neglect information about the expert and generally adopt a loss function based on the distance between expert and learned controls. In this work, we present a loss based on the Q-function directly embedding the performance objectives and constraint satisfaction of the associated Optimal Control Problem (OCP). However, training a Neural Network with the Q-loss requires solving the associated OCP for each new sample. To alleviate the computational burden, we derive a second Q-loss based on the Gauss-Newton approximation of the OCP resulting in a faster training time. We validate our losses against Behavioral Cloning, the standard approach to Imitation Learning, on the control of a nonlinear system with constraints. The final results show that the Q-function-based losses significantly reduce the amount of constraint violations while achieving comparable or better closed-loop costs.
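To make the contrast between the two kinds of loss concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a one-step, unconstrained linear-quadratic problem so that the expert control and the Q-function have closed forms. The matrices `A`, `B`, `Qx`, `R`, `P` and all function names are hypothetical, and the Q-loss is written here as the suboptimality `Q(x, u) - Q(x, u_expert)`, which is one plausible reading of the abstract rather than the paper's exact definition.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): one-step horizon,
# linear dynamics x_next = A x + B u, quadratic stage and terminal costs.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qx = np.eye(2)           # state cost weight
R = np.array([[0.1]])    # control cost weight
P = 10.0 * np.eye(2)     # terminal cost weight


def q_function(x, u):
    """Cost of applying u at x, including the terminal cost of the next state."""
    x_next = A @ x + B @ u
    return float(x @ Qx @ x + u @ R @ u + x_next @ P @ x_next)


def expert_control(x):
    """Minimizer of q_function over u (closed form for this quadratic toy)."""
    H = R + B.T @ P @ B
    return -np.linalg.solve(H, B.T @ P @ A @ x)


def bc_loss(x, u):
    """Behavioral Cloning: squared distance between learned and expert control."""
    du = u - expert_control(x)
    return float(du @ du)


def q_loss(x, u):
    """Q-based loss: how much worse u is than the expert in terms of cost."""
    return q_function(x, u) - q_function(x, expert_control(x))


x = np.array([1.0, -0.5])
u_learned = expert_control(x) + np.array([0.3])  # stand-in for a network output
print("BC loss:", bc_loss(x, u_learned))
print("Q loss: ", q_loss(x, u_learned))
```

In this toy setting the Q-based loss equals the control error weighted by the cost curvature `R + B.T @ P @ B`, so errors that matter more for closed-loop cost are penalized more, whereas the Behavioral Cloning loss treats all control errors alike. For a nonlinear, constrained OCP no such closed form exists, which is why, as the abstract notes, evaluating the exact Q-loss requires solving the OCP for each sample.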
