High order Bellman equations and weakly chained diagonally dominant tensors
Provides theoretical foundations and a policy iteration algorithm for high order Bellman equations, with an application to numerical schemes for optimal control.
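The paper extends Bellman equations to the tensor setting, introduces weakly chained diagonally dominant (w.c.d.d.) tensors, and shows that a positive solution exists and is unique when the tensors in the equation are w.c.d.d. M-tensors. A policy iteration algorithm computes this solution, and the resulting "optimize then discretize" scheme for an optimal control problem outperforms a classical "discretize then optimize" approach in both computation time and accuracy.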
We introduce high order Bellman equations, extending classical Bellman equations to the tensor setting. We introduce weakly chained diagonally dominant (w.c.d.d.) tensors and show that a sufficient condition for the existence and uniqueness of a positive solution to a high order Bellman equation is that the tensors appearing in the equation are w.c.d.d. M-tensors. In this case, we give a policy iteration algorithm to compute this solution. We also prove that a weakly diagonally dominant Z-tensor with nonnegative diagonals is a strong M-tensor if and only if it is w.c.d.d. This last point is analogous to a corresponding result in the matrix setting and tightens a result from [L. Zhang, L. Qi, and G. Zhou. "M-tensors and some applications." SIAM Journal on Matrix Analysis and Applications (2014)]. We apply our results to obtain a provably convergent numerical scheme for an optimal control problem using an "optimize then discretize" approach which outperforms (in both computation time and accuracy) a classical "discretize then optimize" approach. To the best of our knowledge, a link between M-tensors and optimal control has not been previously established.
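For intuition, the w.c.d.d. condition can be checked directly in the matrix (order 2) setting that the tensor result is analogous to. The sketch below is not taken from the paper. It assumes the standard matrix definition: every row is weakly diagonally dominant, and from every row there is a walk in the directed graph of the matrix (an edge i -> j whenever the entry a_ij is nonzero) to some strictly diagonally dominant row. The function name `is_wcdd` and the `tol` parameter are illustrative choices, and extending the check to tensors would require the paper's own definition of the graph of a tensor.

```python
from collections import deque

import numpy as np


def is_wcdd(A, tol=0.0):
    """Return True if the square matrix A is weakly chained diagonally dominant.

    Illustrative matrix-case check only (not the paper's tensor algorithm).
    """
    A = np.asarray(A, dtype=float)
    diag = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - diag  # off-diagonal absolute row sums

    # Every row must be at least weakly diagonally dominant.
    if np.any(diag < off - tol):
        return False

    # Rows that are strictly diagonally dominant are the targets of the walks.
    strict = diag > off + tol

    # Breadth-first search backwards from the strict rows: row i is marked
    # as soon as it has a nonzero entry a_ij pointing at a marked row j.
    reached = strict.copy()
    queue = deque(np.flatnonzero(strict))
    while queue:
        j = queue.popleft()
        for i in np.flatnonzero(A[:, j] != 0):
            if not reached[i]:
                reached[i] = True
                queue.append(i)

    # w.c.d.d. holds when every row can reach a strictly dominant row.
    return bool(reached.all())


# Rows 0 and 1 are only weakly dominant but chain to the strict row 2.
chained = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]
# Weakly dominant everywhere, but no strictly dominant row exists (singular).
unchained = [[1, -1], [-1, 1]]
print(is_wcdd(chained), is_wcdd(unchained))
```

The backward search visits each column at most once, so the check costs O(n^2) for a dense n-by-n matrix. In the matrix setting this is the property that separates nonsingular weakly diagonally dominant matrices from singular ones such as `unchained`.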