LG · SY · Mar 26, 2021

Improved Initialization of State-Space Artificial Neural Networks

arXiv:2103.14516v1 · 30 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in system identification for researchers and practitioners, but it is incremental as it builds on existing initialization methods.

The paper tackles the problem of poor local minima in training state-space neural networks by introducing an improved initialization method that uses a linear approximation for some weights and random values or zeros for others, demonstrating its effectiveness on two benchmark examples.

The identification of black-box nonlinear state-space models requires a flexible representation of the state and output equations. Artificial neural networks have proven to provide such a representation. However, as in many identification problems, a nonlinear optimization problem needs to be solved to obtain the model parameters (layer weights and biases). A well-chosen initialization of these model parameters can often prevent the nonlinear optimization algorithm from converging to a poorly performing local minimum of the cost function. This paper introduces an improved initialization approach for nonlinear state-space models represented as a recurrent artificial neural network and emphasizes the importance of including an explicit linear term in the model structure. Some of the neural network weights are initialized starting from a linear approximation of the nonlinear system, while others are initialized using random values or zeros. The effectiveness of the proposed initialization approach over previously proposed methods is illustrated on two benchmark examples.
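The initialization idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: the model form (a linear state-space term plus one-hidden-layer tanh networks in the state and output equations), the parameter names, the scale of the random weights, and in particular which weights are zero (here the output layers of the nonlinear parts) and which are random (here the hidden layers). The matrices A, B, C, D stand in for a linear approximation of the system, which would in practice come from a linear identification step; the values below are hypothetical placeholders.

```python
import numpy as np


def init_ss_nn(A, B, C, D, n_hidden, rng):
    """Build initial parameters for a state-space neural network (illustrative).

    Assumed model form (not taken from the paper):
        x[k+1] = A x[k] + B u[k] + Wf_out tanh(Wf_in [x; u] + bf)
        y[k]   = C x[k] + D u[k] + Wg_out tanh(Wg_in [x; u] + bg)
    """
    n_x, n_u = B.shape
    n_y = C.shape[0]
    return {
        # Explicit linear term: copied from the linear approximation.
        "A": A.copy(), "B": B.copy(), "C": C.copy(), "D": D.copy(),
        # Hidden layers: small random weights (assumed scale 0.1), zero biases.
        "Wf_in": 0.1 * rng.standard_normal((n_hidden, n_x + n_u)),
        "bf": np.zeros(n_hidden),
        "Wg_in": 0.1 * rng.standard_normal((n_hidden, n_x + n_u)),
        "bg": np.zeros(n_hidden),
        # Output layers of the nonlinear parts: zeros, so the initial
        # model behaves exactly like the linear approximation.
        "Wf_out": np.zeros((n_x, n_hidden)),
        "Wg_out": np.zeros((n_y, n_hidden)),
    }


def simulate(p, u, x0):
    """Simulate the state-space neural network for an input sequence u."""
    x = x0.copy()
    outputs = []
    for uk in u:
        z = np.concatenate([x, uk])  # stacked state and input
        y = p["C"] @ x + p["D"] @ uk + p["Wg_out"] @ np.tanh(p["Wg_in"] @ z + p["bg"])
        x = p["A"] @ x + p["B"] @ uk + p["Wf_out"] @ np.tanh(p["Wf_in"] @ z + p["bf"])
        outputs.append(y)
    return np.array(outputs)


def simulate_linear(A, B, C, D, u, x0):
    """Simulate the purely linear state-space model for comparison."""
    x = x0.copy()
    outputs = []
    for uk in u:
        outputs.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(outputs)


# Hypothetical linear approximation (placeholder values, two states, one input, one output).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

rng = np.random.default_rng(0)
params = init_ss_nn(A, B, C, D, n_hidden=8, rng=rng)

u = rng.standard_normal((50, 1))  # random test input sequence
x0 = np.zeros(2)
y_nn = simulate(params, u, x0)
y_lin = simulate_linear(A, B, C, D, u, x0)
```

With this choice, the network's simulated output at initialization coincides with that of the linear model, so the nonlinear optimization starts from the linear model's cost rather than from a random point. The random hidden-layer weights keep the gradients with respect to the zero-initialized output layers nonzero, so training can move away from the linear solution.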

