Tailored neural networks for learning optimal value functions in MPC
This work provides a method for improving learning-based predictive control in linear MPC. The contribution is incremental, however, since it extends prior results on optimal control policies to value functions and Q-functions.
The paper tackles the problem of efficiently learning optimal value functions and Q-functions in linear model predictive control (MPC). It proposes tailored neural networks that exploit the piecewise quadratic structure of these functions and can therefore represent them exactly.
Learning-based predictive control is a promising alternative to optimization-based MPC. However, efficiently learning the optimal control policy, the optimal value function, or the Q-function requires suitable function approximators. Artificial neural networks (ANNs) are often considered, but choosing a suitable topology is non-trivial. Against this background, it has recently been shown that tailored ANNs can, in principle, exactly describe the optimal control policy in linear MPC by exploiting its piecewise affine structure. In this paper, we provide a similar result for representing the optimal value function and the Q-function, which are both known to be piecewise quadratic for linear MPC.
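The piecewise affine and piecewise quadratic structure mentioned above can be made concrete with a small toy problem. The following sketch is an illustration chosen here, not an example from the paper: it assumes a scalar system x+ = x + u, a horizon of one step, the cost x^2 + u^2 + (x + u)^2, and the input constraint |u| <= 1. For this setup the optimal policy is the saturation of -x/2, which two ReLU units reproduce exactly, and the optimal value function is a quadratic plus a squared ReLU term. The squared-ReLU form is only a convenient way to write the function for this toy case and is not claimed to be the tailored architecture of the paper.

```python
def relu(z):
    """Rectified linear unit for scalars."""
    return max(z, 0.0)


def policy_relu(x):
    """Optimal policy u*(x) = clip(-x/2, -1, 1), written with two ReLU units."""
    z = -0.5 * x
    return -1.0 + relu(z + 1.0) - relu(z - 1.0)


def value_closed_form(x):
    """Piecewise quadratic optimal value function of the toy problem.

    Equals 1.5*x^2 for |x| <= 2 and 2*x^2 - 2*|x| + 2 otherwise.
    """
    return 1.5 * x**2 + 0.5 * relu(abs(x) - 2.0) ** 2


def value_brute_force(x, n_grid=2001):
    """Reference value: minimize the cost over a grid of admissible inputs."""
    best = float("inf")
    for i in range(n_grid):
        u = -1.0 + 2.0 * i / (n_grid - 1)  # u sweeps the interval [-1, 1]
        cost = x**2 + u**2 + (x + u) ** 2
        best = min(best, cost)
    return best


# Compare both representations on a grid of states (assumed range [-5, 5]).
xs = [-5.0 + 0.25 * k for k in range(41)]
max_value_err = max(abs(value_closed_form(x) - value_brute_force(x)) for x in xs)
max_policy_err = max(
    abs(policy_relu(x) - min(max(-0.5 * x, -1.0), 1.0)) for x in xs
)
print("max value error:", max_value_err)
print("max policy error:", max_policy_err)
```

The value error is limited only by the resolution of the input grid in the brute-force reference, which suggests that, at least in this toy case, a network whose structure matches the piecewise quadratic form needs no approximation at all.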