LG NE DS AO | Jul 11, 2024

How more data can hurt: Instability and regularization in next-generation reservoir computing

arXiv:2407.08641v3 | 9 citations | h-index: 3
AI Analysis

This paper addresses a critical issue for researchers who model dynamical systems with data-driven methods, though its contribution is incremental because it builds on a phenomenon already known from deep learning.

The paper studies data-induced instability in next-generation reservoir computing (NGRC) for dynamical systems. It shows that more training data can worsen performance because the trained model adopts an ill-conditioned integrator, and it proposes two mitigations: regularization that grows with data size, and noise introduced during training. In the experiments, these strategies improve stability.

It has been found recently that more data can, counter-intuitively, hurt the performance of deep neural networks. Here, we show that a more extreme version of the phenomenon occurs in data-driven models of dynamical systems. To elucidate the underlying mechanism, we focus on next-generation reservoir computing (NGRC), a popular framework for learning dynamics from data. We find that, despite learning a better representation of the flow map with more training data, NGRC can adopt an ill-conditioned "integrator" and lose stability. We link this data-induced instability to the auxiliary dimensions created by the delayed states in NGRC. Based on these findings, we propose simple strategies to mitigate the instability, either by increasing regularization strength in tandem with data size, or by carefully introducing noise during training. Our results highlight the importance of proper regularization in data-driven modeling of dynamical systems.
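
To make the two mitigation strategies concrete, here is a minimal Python sketch of an NGRC-style model fitted by ridge regression. It is an illustration under stated assumptions, not the paper's implementation or the linked repository's code. The feature set (constant, linear, and quadratic monomials of k delayed states), the linear scaling `lam = lam0 * N`, the placement of noise on the input states only, the Lotka-Volterra test system, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def ngrc_features(X, k=2):
    """Constant + linear + quadratic monomials of k delayed states.

    X has shape (T, d). Returns an array of shape (T - k + 1, n_features).
    Row j is built from X[j + k - 1], X[j + k - 2], ..., X[j].
    """
    T = X.shape[0]
    # Stack the current state and its k - 1 delays side by side.
    lin = np.hstack([X[k - 1 - i : T - i] for i in range(k)])
    # All unique pairwise products (including squares).
    iu, ju = np.triu_indices(lin.shape[1])
    quad = lin[:, iu] * lin[:, ju]
    return np.hstack([np.ones((lin.shape[0], 1)), lin, quad])

def fit_ngrc(X, k=2, lam0=1e-8, scale_with_data=True, noise_std=0.0, seed=0):
    """Ridge regression from delayed-state features to the next increment.

    Assumption: "regularization in tandem with data size" is modeled as
    lam = lam0 * N, and training noise is added to the inputs only.
    """
    X_in = X
    if noise_std > 0.0:
        rng = np.random.default_rng(seed)
        X_in = X + noise_std * rng.standard_normal(X.shape)
    Phi = ngrc_features(X_in, k)[:-1]      # features at times k-1 .. T-2
    Y = X[k:] - X[k - 1 : -1]              # targets: x_{t+1} - x_t
    lam = lam0 * len(Phi) if scale_with_data else lam0
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y)   # weights, shape (n_features, d)

def predict(W, X_init, n_steps, k=2):
    """Run the trained model autonomously from the last k rows of X_init."""
    hist = [row for row in X_init[-k:]]
    for _ in range(n_steps):
        phi = ngrc_features(np.array(hist[-k:]), k)[0]
        hist.append(hist[-1] + phi @ W)
    return np.array(hist[k:])

def lotka_volterra_euler(T=2000, dt=0.01, x0=(1.5, 1.0)):
    """Toy training data: forward-Euler steps of a Lotka-Volterra system."""
    X = np.empty((T, 2))
    X[0] = x0
    for t in range(T - 1):
        x, y = X[t]
        X[t + 1] = X[t] + dt * np.array([x * (1.0 - y), y * (x - 1.0)])
    return X

X = lotka_volterra_euler()
W = fit_ngrc(X, k=2, lam0=1e-8, scale_with_data=True)
forecast = predict(W, X, n_steps=10, k=2)
```

Setting `scale_with_data=False` keeps the regularization fixed while the training set grows, which is the regime the abstract associates with instability. Whether a given forecast actually diverges depends on the system, the step size, and the feature set, so this sketch makes no claim about reproducing the paper's results.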

Code Implementations: 1 repo
