Learning neural state-space models: do we need a state estimator?
This work addresses a practical issue in neural state-space model training for system identification, offering guidance that can reduce computational overhead in some cases, though it is incremental in nature.
The paper investigates whether advanced initial state estimation is necessary for training neural state-space models in system identification, finding that it is crucial for certain dynamical systems but not for asymptotically stable ones where basic methods suffice.
In recent years, several algorithms for system identification with neural state-space models have been introduced. Most of the proposed approaches aim to reduce the computational complexity of the learning problem by splitting the optimization over short sub-sequences extracted from a longer training dataset. Different sequences are then processed simultaneously within a minibatch, taking advantage of modern parallel hardware for deep learning. An issue arising in these methods is the need to assign an initial state to each of the sub-sequences, which is required to run simulations and thus to evaluate the fitting loss. In this paper, we provide insights for calibrating neural state-space training algorithms, based on extensive experimentation and analyses performed on two recognized system identification benchmarks. Particular focus is given to the choice and the role of the initial state estimation. We demonstrate that advanced initial state estimation techniques are indeed required to achieve high performance on certain classes of dynamical systems, while for asymptotically stable ones basic procedures such as zero or random initialization already yield competitive performance.
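
The sub-sequence setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a scalar, asymptotically stable linear system (x[k+1] = a*x[k] + b*u[k], with |a| < 1) stands in for the neural state-space model, and all names (`extract_subsequences`, `simulate`, `mse`, the values of `a`, `b`, `seq_len`, `batch_size`) are hypothetical choices made for illustration. The sketch draws a minibatch of windows from one long record and compares the fitting loss obtained with zero initialization against the loss obtained with the true initial state of each window.

```python
import random

def extract_subsequences(u, y, seq_len, batch_size, rng):
    """Sample batch_size windows of length seq_len from the long record (u, y)."""
    starts = [rng.randrange(0, len(u) - seq_len + 1) for _ in range(batch_size)]
    batch_u = [u[s:s + seq_len] for s in starts]
    batch_y = [y[s:s + seq_len] for s in starts]
    return starts, batch_u, batch_y

def simulate(x0, u_seq, a=0.8, b=1.0):
    """Simulate x[k+1] = a*x[k] + b*u[k] with output y[k] = x[k].

    Assumption: this stable scalar system is a stand-in for a neural
    state-space model; |a| < 1 makes it asymptotically stable.
    """
    x, y_hat = x0, []
    for u_k in u_seq:
        y_hat.append(x)
        x = a * x + b * u_k
    return y_hat

def mse(y, y_hat):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(y, y_hat)) / len(y)

rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
y = simulate(0.0, u)  # synthetic "measured" output of the long training record

starts, batch_u, batch_y = extract_subsequences(u, y, seq_len=64, batch_size=16, rng=rng)

# Zero initialization: every window is simulated from x0 = 0.
sims_zero = [simulate(0.0, bu) for bu in batch_u]
losses_zero = [mse(by, sim) for by, sim in zip(batch_y, sims_zero)]

# Reference: the true initial state of each window (here y[s], since y = x).
losses_true = [mse(by, simulate(y[s], bu)) for s, bu, by in zip(starts, batch_u, batch_y)]

# Mismatch at the end of each window under zero initialization.
tail_errors = [abs(by[-1] - sim[-1]) for by, sim in zip(batch_y, sims_zero)]
print(max(losses_zero), max(losses_true), max(tail_errors))
```

In this toy setting the error caused by a wrong initial state shrinks geometrically as a^k, so by the end of a 64-step window it is negligible. This matches the intuition behind the abstract's claim for asymptotically stable systems. For systems without this property the initial-state error need not vanish, which is where a dedicated estimator becomes relevant.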