Recurrent Neural Network Training with Convex Loss and Regularization Functions by Extended Kalman Filtering
This work addresses training challenges for recurrent neural networks in control and system identification, but the contribution is incremental, since it adapts an existing filtering method to neural network training.
The paper trains recurrent neural networks with convex loss and regularization terms using extended Kalman filtering, and shows that the method is competitive with stochastic gradient descent on two problems: a nonlinear system identification benchmark and the training of a linear system with binary outputs.
This paper investigates the use of extended Kalman filtering to train recurrent neural networks with rather general convex loss functions and regularization terms on the network parameters, including $\ell_1$-regularization. We show that the learning method is competitive with stochastic gradient descent in a nonlinear system identification benchmark and in training a linear system with binary outputs. We also explore the use of the algorithm in data-driven nonlinear model predictive control and its relation to disturbance models for offset-free closed-loop tracking.
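
To make the idea of using an extended Kalman filter as a training algorithm concrete, the following is a minimal sketch under strong simplifying assumptions. It is not the paper's algorithm. It fits a single static tanh unit rather than a recurrent network, uses only the squared loss (the simplest convex case), and omits regularization. The parameters are treated as a random-walk state that is corrected after every sample. The function name `ekf_train`, the model $y = \tanh(\theta_1 u + \theta_2)$, and the values chosen for `Q`, `R`, and `P0` are all illustrative assumptions.

```python
import numpy as np

def ekf_train(U, Y, theta0, P0, Q, R):
    """EKF over the parameters theta of the assumed model y = tanh(theta[0]*u + theta[1])."""
    theta = np.asarray(theta0, dtype=float).copy()
    P = np.asarray(P0, dtype=float).copy()
    for u, y in zip(U, Y):
        # Time update: random-walk parameter model, theta_{k+1} = theta_k + noise
        P = P + Q
        # Predicted output and its Jacobian with respect to theta
        y_hat = np.tanh(theta[0] * u + theta[1])
        H = (1.0 - y_hat**2) * np.array([u, 1.0])
        # Measurement update for a scalar output
        S = H @ P @ H + R              # innovation variance
        K = P @ H / S                  # Kalman gain, shape (2,)
        theta = theta + K * (y - y_hat)
        P = P - np.outer(K, H @ P)     # (I - K H) P
        P = 0.5 * (P + P.T)            # keep the covariance symmetric
    return theta, P

# Synthetic, noise-free data generated by hypothetical "true" parameters
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.5])
U = rng.uniform(-2.0, 2.0, size=500)
Y = np.tanh(theta_true[0] * U + theta_true[1])

theta_hat, P_hat = ekf_train(
    U, Y, theta0=np.zeros(2), P0=np.eye(2), Q=1e-4 * np.eye(2), R=1e-2
)
print("estimated parameters:", theta_hat)
```

In this sketch `Q` and `R` act as tuning knobs: a larger `R` or a smaller `Q` gives smaller parameter updates per sample, which loosely plays the role that the learning rate plays in stochastic gradient descent.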