Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers
This work addresses the problem of efficiently training neural networks with complex regularization, which is of interest to researchers in system identification, though it appears incremental because it builds on existing optimization methods.
The paper tackles training recurrent neural networks for nonlinear dynamical systems by proposing an algorithm that combines sequential least squares with ADMM to handle non-smooth regularization and possibly non-convex constraints, and tests it on three nonlinear system identification problems.
This paper proposes a novel algorithm for training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset. Arbitrary convex and twice-differentiable loss functions and regularization terms are handled by sequential least squares, with either a line-search (LS) or a trust-region method of Levenberg-Marquardt (LM) type to ensure convergence. In addition, to handle non-smooth regularization terms such as $\ell_1$, $\ell_0$, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). We call the resulting algorithm NAILS (nonconvex ADMM iterations and least squares) when line search (LS) is used, or NAILM when a trust-region method (LM) is employed instead. The training method, which is also applicable to feedforward neural networks as a special case, is tested in three nonlinear system identification problems.
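To illustrate how ADMM separates a smooth least-squares step from a non-smooth regularizer, the following is a minimal sketch on a plain linear least-squares problem with an $\ell_1$ penalty. It is not the paper's NAILS/NAILM algorithm: there is no recurrent model, no sequential linearization, and no line search or trust region. The function names `soft_threshold` and `admm_lasso`, and the parameters `lam`, `rho`, and `n_iter`, are illustrative assumptions rather than names from the paper.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Minimize 0.5*||A x - b||^2 + lam*||z||_1 subject to x = z (scaled-form ADMM).

    Assumed, simplified setting: a linear model stands in for the network,
    so the smooth subproblem is a single regularized least-squares solve.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    # The x-update always uses the same matrix, so factor it once.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(n_iter):
        # x-update: smooth least-squares step, (A'A + rho I) x = A'b + rho (z - u)
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: non-smooth step, handled in closed form by shrinkage
        z = soft_threshold(x + u, lam / rho)
        # dual update on the consensus constraint x = z
        u = u + x - z
    return z
```

With `A` equal to the identity, the minimizer is simply `soft_threshold(b, lam)`, which gives a quick sanity check of the iteration. In the setting described in the abstract, the x-update would presumably be replaced by the sequential least-squares step on the network parameters, while the z-update keeps this proximal form for $\ell_1$ or a different one for other regularizers.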