LG, May 29

Flow map learning in nonlinear vector autoregressive models: influence of the feature-library structure on the training error

arXiv:2605.31438

AI Analysis

This work characterizes how the training error of NVAR/NG-RC models behaves, which matters to researchers and practitioners who use these models for time series forecasting. It clarifies how feature-library structure, temporal resolution, and generalization interact.

This paper investigates the identifiability problem for nonlinear vector autoregressive (NVAR) models, also known as next-generation reservoir computers (NG-RCs), in time series forecasting. It shows that the scaling of the training error with temporal resolution depends on whether the feature library can represent the early Lie-series coefficients of the flow map exactly or only approximately. It also finds that delay terms reduce the optimal one-step training error but improve long-horizon forecasts only when the library is sufficiently nonlinear.
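For orientation, the Lie-series coefficients mentioned here can be written out with the standard expansion of a flow map. The notation below (vector field f, flow map \varphi_{\Delta t}, Lie derivative L_f, polynomial degree p, library degree d) is assumed for illustration and is not taken from the paper.

```latex
% Autonomous system and its flow map over one sampling interval \Delta t
\dot{x} = f(x), \qquad x(t + \Delta t) = \varphi_{\Delta t}\bigl(x(t)\bigr)

% Lie series of the flow map (formal expansion, valid for analytic f and small \Delta t)
\varphi_{\Delta t}(x) = \sum_{k \ge 0} \frac{\Delta t^{k}}{k!} \bigl(L_f^{k}\,\mathrm{id}\bigr)(x)
  = x + \Delta t\, f(x) + \frac{\Delta t^{2}}{2}\, Df(x)\, f(x) + O(\Delta t^{3}),
\qquad L_f g := Dg \cdot f

% If f is a polynomial of degree p, each application of L_f raises the degree by at most p - 1:
\deg\bigl(L_f^{k}\,\mathrm{id}\bigr) \le 1 + k\,(p - 1)
```

Under these assumptions, a monomial library of degree d in the current state contains the coefficients of order k exactly whenever 1 + k(p - 1) <= d; higher-order coefficients can only be approximated. For a quadratic vector field (p = 2) and a quadratic library (d = 2), only the first-order coefficient f(x) is represented exactly.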

Time series forecasting often requires learning nonlinear and time-delayed dependencies. A paradigmatic class of forecasting models is that of nonlinear vector autoregressive processes (NVAR), also known as next-generation reservoir computers (NG-RCs). These models approximate the Koopman operator on the space spanned by their explicit feature library. We consider the identifiability problem for learning Markovian nonlinear dynamical systems and show that the training error as a function of time resolution follows characteristic (pre-)asymptotic scaling laws. These laws depend on whether the feature library can represent the early Lie-series coefficients of the flow map (propagator) exactly or merely approximately. For dynamical systems governed by polynomial vector fields, we demonstrate the mechanism for NVAR/NG-RC models with monomial and Fourier feature libraries. We determine the dependence of the training error on the temporal resolution, the nonlinear degree involved, and the number of delay terms. While delay terms reduce the optimal one-step training error, they improve long-horizon forecasts only when the library provides sufficient nonlinearity. Thus, small training error coexists with weak generalization as the model class is mismatched to the true data-generating process. Numerical experiments on various chaotic dynamical systems confirm the theoretical predictions.
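The setup described in the abstract can be illustrated with a minimal sketch: an NVAR-style model with a monomial feature library and delay terms, fitted for one-step prediction at several sampling intervals. Everything specific here is an assumption made for illustration and not the paper's code: the Lorenz-63 system as the example polynomial vector field, plain least squares in place of the ridge regression often used for NG-RCs, and the hypothetical helper names `sample_trajectory`, `monomial_library`, and `fit_nvar`.

```python
import numpy as np
from itertools import combinations_with_replacement


def lorenz(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz-63 vector field (a quadratic polynomial); an assumed example system."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])


def sample_trajectory(f, x0, dt, n_steps, substeps=5):
    """Sample the flow every dt, integrating with RK4 on a finer inner grid."""
    h = dt / substeps
    x = np.asarray(x0, dtype=float)
    traj = np.empty((n_steps + 1, x.size))
    traj[0] = x
    for i in range(n_steps):
        for _ in range(substeps):
            k1 = f(x)
            k2 = f(x + 0.5 * h * k1)
            k3 = f(x + 0.5 * h * k2)
            k4 = f(x + h * k3)
            x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i + 1] = x
    return traj


def monomial_library(traj, n_delays, degree):
    """Constant plus all monomials up to `degree` in x_t, x_{t-1}, ..., x_{t-n_delays}."""
    T = len(traj)
    # Row r corresponds to time t = n_delays + r; the last usable t is T - 2.
    lin = np.hstack([traj[n_delays - j: T - 1 - j] for j in range(n_delays + 1)])
    cols = [np.ones(len(lin))]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(lin.shape[1]), d):
            cols.append(np.prod(lin[:, list(idx)], axis=1))
    return np.column_stack(cols)


def fit_nvar(traj, n_delays, degree):
    """Least-squares fit of x_{t+1} on the library; returns weights and training RMSE."""
    Phi = monomial_library(traj, n_delays, degree)
    Y = traj[n_delays + 1:]
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    rmse = np.sqrt(np.mean((Phi @ W - Y) ** 2))
    return W, rmse


# One-step training error for several sampling intervals, with and without one delay term.
errors = {}
total_time = 20.0
for dt in (0.02, 0.01, 0.005):
    traj = sample_trajectory(lorenz, [1.0, 1.0, 1.0], dt, n_steps=round(total_time / dt))
    for n_delays in (0, 1):
        _, rmse = fit_nvar(traj, n_delays, degree=2)
        errors[(dt, n_delays)] = rmse
        print(f"dt={dt:<6} delays={n_delays} one-step training RMSE={rmse:.3e}")
```

The printed values can be compared across `dt` and `n_delays` to see how the one-step training error changes with temporal resolution and delay terms. This sketch measures training error only; assessing long-horizon forecasts would require iterating the fitted model on its own outputs, which is not shown.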

