LG Apr 2, 2023
Infinite-dimensional reservoir computing
Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
Reservoir computing approximation and generalization bounds are proved for a new concept class of input/output systems that extends the so-called generalized Barron functionals to a dynamic context. This new class is characterized by the readouts with a certain integral representation built on infinite-dimensional state-space systems. It is shown that this class is very rich and possesses useful features and universal approximation properties. The reservoir architectures used for the approximation and estimation of elements in the new class are randomly generated echo state networks with either linear or ReLU activation functions. Their readouts are built using randomly generated neural networks in which only the output layer is trained (extreme learning machines or random feature neural networks). The results in the paper yield a fully implementable recurrent neural network-based learning algorithm with provable convergence guarantees that do not suffer from the curse of dimensionality.
LG Dec 30, 2022
Reservoir kernels and Volterra series
Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter, and it is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. The empirical performance of the Volterra reservoir kernel is showcased and compared to other standard static and sequential kernels in a multidimensional and highly nonlinear learning task for the conditional covariances of financial asset returns.
LG Dec 13, 2024
Infinite-dimensional next-generation reservoir computing
Lyudmila Grigoryeva, Hannah Lim Jing Ting, Juan-Pablo Ortega
Next-generation reservoir computing (NG-RC) has attracted much attention due to its excellent performance in spatio-temporal forecasting of complex systems and its ease of implementation. This paper shows that NG-RC can be encoded as a kernel ridge regression that makes training efficient and feasible even when the space of chosen polynomial features is very large. Additionally, an extension to an infinite number of covariates is possible, which makes the methodology agnostic with respect to the lags into the past that are considered as explanatory factors, as well as with respect to the number of polynomial covariates, an important hyperparameter in traditional NG-RC. We show that this approach has solid theoretical backing and good behavior based on kernel universality properties previously established in the literature. Various numerical illustrations show that these generalizations of NG-RC outperform the traditional approach in several forecasting applications.
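The idea of training a polynomial-feature forecaster through a kernel, rather than by enumerating monomials, can be sketched as follows. This is a minimal illustration and not the paper's construction: the inhomogeneous polynomial kernel, the logistic-map toy data, the two lags, and all function names (`delay_vectors`, `poly_kernel`, `fit_krr`, `predict_krr`) are assumptions chosen for the example.

```python
import numpy as np

def delay_vectors(series, lags):
    """Return rows of `lags` consecutive values and the value that follows each row."""
    n = len(series) - lags
    Z = np.column_stack([series[i:i + n] for i in range(lags)])
    return Z, series[lags:]

def poly_kernel(A, B, degree):
    """Inhomogeneous polynomial kernel; implicitly spans all monomials up to `degree`."""
    return (1.0 + A @ B.T) ** degree

def fit_krr(Z, y, degree, lam):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    K = poly_kernel(Z, Z, degree)
    return np.linalg.solve(K + lam * np.eye(len(Z)), y)

def predict_krr(Z_train, alpha, Z_new, degree):
    """Evaluate the fitted forecaster on new delay vectors."""
    return poly_kernel(Z_new, Z_train, degree) @ alpha

# Toy data (assumption): a logistic map, whose one-step rule is a degree-2 polynomial.
x = np.empty(400)
x[0] = 0.3
for t in range(399):
    x[t + 1] = 3.9 * x[t] * (1.0 - x[t])

Z, y = delay_vectors(x, lags=2)
Z_tr, y_tr, Z_te, y_te = Z[:300], y[:300], Z[300:], y[300:]

alpha = fit_krr(Z_tr, y_tr, degree=2, lam=1e-6)
test_error = np.max(np.abs(predict_krr(Z_tr, alpha, Z_te, degree=2) - y_te))
```

The linear system solved here has one unknown per training sample, so its size does not grow with the number of polynomial covariates; only the kernel evaluation depends on the degree and the lags.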
LG Feb 7, 2025
Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?
Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega
The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact questions the usefulness of this metric in distinguishing the performance of linear RNNs in the processing of stochastic signals. This work shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending exclusively on the scale of the input process. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.
LG May 2, 2023
Memory of recurrent networks: Do we compute it right?
Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega
Numerical evaluations of the memory capacity (MC) of recurrent neural networks reported in the literature often contradict well-established theoretical bounds. In this paper, we study the case of linear echo state networks, for which the total memory capacity has been proven to be equal to the rank of the corresponding Kalman controllability matrix. We shed light on various reasons for the inaccurate numerical estimations of the memory, and we show that these issues, often overlooked in the recent literature, are of an exclusively numerical nature. More explicitly, we prove that when the Krylov structure of the linear MC is ignored, a gap between the theoretical MC and its empirical counterpart is introduced. As a solution, we develop robust numerical approaches by exploiting a result of MC neutrality with respect to the input mask matrix. Simulations show that the memory curves that are recovered using the proposed methods fully agree with the theory.
DS Aug 11, 2021
Learning strange attractors with reservoir systems
Lyudmila Grigoryeva, Allen Hart, Juan-Pablo Ortega
This paper shows that the celebrated Embedding Theorem of Takens is a particular case of a much more general statement according to which randomly generated linear state-space representations of generic observations of an invertible dynamical system carry in their wake an embedding of the phase space dynamics into the chosen Euclidean state space. This embedding coincides with a natural generalized synchronization that arises in this setup and that yields a topological conjugacy between the state-space dynamics driven by the generic observations of the dynamical system and the dynamical system itself. This result provides additional tools for the representation, learning, and analysis of chaotic attractors and sheds additional light on the reservoir computing phenomenon that appears in the context of recurrent neural networks.
NE Sep 17, 2020
Discrete-time signatures and randomness in reservoir computing
Christa Cuchiero, Lukas Gonon, Lyudmila Grigoryeva et al.
A new explanation, geometric in nature, of the reservoir computing phenomenon is presented. Reservoir computing is understood in the literature as the possibility of approximating input/output systems with randomly chosen recurrent neural systems and a trained linear readout layer. Light is shed on this phenomenon by constructing what is called strongly universal reservoir systems as random projections of a family of state-space systems that generate Volterra series expansions. This procedure yields a state-affine reservoir system with randomly generated coefficients in a dimension that is logarithmically reduced with respect to the original system. This reservoir system is able to approximate any element in the fading memory filters class just by training a different linear readout for each different filter. Explicit expressions for the probability distributions needed in the generation of the projected reservoir system are stated and bounds for the committed approximation error are provided.
OC Jul 23, 2020
Dimension reduction in recurrent networks by canonicalization
Lyudmila Grigoryeva, Juan-Pablo Ortega
Many recurrent neural network machine learning paradigms can be formulated using state-space representations. The classical notion of canonical state-space realization is adapted in this paper to accommodate semi-infinite inputs so that it can be used as a dimension reduction tool in the recurrent networks setup. The so-called input forgetting property is identified as the key hypothesis that guarantees the existence and uniqueness (up to system isomorphisms) of canonical realizations for causal and time-invariant input/output systems with semi-infinite inputs. Additionally, the notion of optimal reduction coming from the theory of symmetric Hamiltonian systems is implemented in our setup to construct canonical realizations out of input forgetting but not necessarily canonical ones. These two procedures are studied in detail in the framework of linear fading memory input/output systems. Finally, the notion of implicit reduction using reproducing kernel Hilbert spaces (RKHS) is introduced, which makes it possible, for systems with linear readouts, to achieve dimension reduction without the need to actually compute the reduced spaces introduced in the first part of the paper.
OC Apr 22, 2020
Memory and forecasting capacities of nonlinear recurrent networks
Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
The notion of memory capacity, originally introduced for echo state and linear networks with independent inputs, is generalized to nonlinear recurrent networks with stationary but dependent inputs. The presence of dependence in the inputs makes natural the introduction of the network forecasting capacity, which measures the possibility of forecasting time series values using network states. Generic bounds for memory and forecasting capacities are formulated in terms of the number of neurons of the nonlinear recurrent network and the autocovariance function or the spectral density of the input. These bounds generalize well-known estimates in the literature to a dependent inputs setup. Finally, for the particular case of linear recurrent networks with independent inputs it is proved that the memory capacity is given by the rank of the associated controllability matrix, a fact that the community has long assumed to be true without proof.
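The rank statement for linear networks can be illustrated with a small sketch. It assumes a linear state equation of the form x_t = A x_{t-1} + C z_t with scalar input, and the names `controllability_matrix` and `linear_memory_capacity` are hypothetical helpers introduced only for this example. A naive rank computation like this one is meant for small state dimensions only, since Krylov-type matrices become badly conditioned as the dimension grows.

```python
import numpy as np

def controllability_matrix(A, C):
    """Stack the Krylov vectors C, AC, ..., A^(N-1) C as columns."""
    N = A.shape[0]
    cols = [C]
    for _ in range(N - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

def linear_memory_capacity(A, C):
    """Rank of the controllability matrix, used here as the total memory capacity."""
    return int(np.linalg.matrix_rank(controllability_matrix(A, C)))

C = np.ones(3)

# Distinct eigenvalues and an input mask with no zero entries: full rank.
A_generic = np.diag([0.1, 0.2, 0.3])
mc_generic = linear_memory_capacity(A_generic, C)

# A multiple of the identity: every Krylov vector is parallel to C, so the rank is one.
A_degenerate = 0.5 * np.eye(3)
mc_degenerate = linear_memory_capacity(A_degenerate, C)
```

The two cases show why the capacity is maximal for generic weights and drops only for special connectivity structures.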
PR Feb 14, 2020
Approximation Bounds for Random Neural Networks and Reservoir Systems
Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
This work studies approximation based on single-hidden-layer feedforward and recurrent neural networks with randomly generated internal weights. These methods, in which only the last layer of weights and a few hyperparameters are optimized, have been successfully applied in a wide range of static and dynamic learning problems. Despite the popularity of this approach in empirical tasks, important theoretical questions regarding the relation between the unknown function, the weight distribution, and the approximation rate have remained open. In this work it is proved that, as long as the unknown function, functional, or dynamical system is sufficiently regular, it is possible to draw the internal weights of the random (recurrent) neural network from a generic distribution (not depending on the unknown object) and quantify the error in terms of the number of neurons and the hyperparameters. In particular, this proves that echo state networks with randomly generated weights are capable of approximating a wide class of dynamical systems arbitrarily well and thus provides the first mathematical explanation for their empirically observed success at learning dynamical systems.
LG Oct 30, 2019
Risk bounds for reservoir computing
Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This makes it possible, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals.
NE Feb 16, 2019
Differentiable reservoir computing
Lyudmila Grigoryeva, Juan-Pablo Ortega
Much effort has been devoted in the last two decades to characterizing the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor's theorem and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory.
NE Jun 3, 2018
Echo state networks are universal
Lyudmila Grigoryeva, Juan-Pablo Ortega
This paper shows that echo state networks are universal uniform approximants in the context of discrete-time fading memory filters with uniformly bounded inputs defined on negative infinite times. This result guarantees that any fading memory input/output system in discrete time can be realized as a simple finite-dimensional neural network-type state-space model with a static linear readout map. This approximation is valid for infinite time intervals. The proof of this statement is based on fundamental results, also presented in this work, about the topological nature of the fading memory property and about reservoir computing systems generated by continuous reservoir maps.
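The architecture referred to here, a randomly generated state-space model in which only a static linear readout is trained, can be sketched as below. This is an illustration of the model class under stated assumptions, not of the universality proof: the tanh activation, the one-step recall task, the reservoir size, the ridge penalty, and all names (`make_reservoir`, `run_reservoir`, `fit_readout`, `apply_readout`) are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def make_reservoir(n_neurons, norm_bound, input_scale):
    """Draw random reservoir and input weights; they are never trained."""
    W = rng.standard_normal((n_neurons, n_neurons))
    # Spectral norm below one makes the tanh state map a contraction in the state.
    W *= norm_bound / np.linalg.norm(W, 2)
    c = input_scale * rng.uniform(-1.0, 1.0, n_neurons)
    return W, c

def run_reservoir(W, c, u):
    """Iterate x_t = tanh(W x_{t-1} + c u_t) and collect the states row by row."""
    x = np.zeros(W.shape[0])
    states = np.empty((len(u), W.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + c * u_t)
        states[t] = x
    return states

def fit_readout(states, y, lam):
    """Ridge regression for the linear readout (last entry is the intercept)."""
    X = np.hstack([states, np.ones((len(states), 1))])
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def apply_readout(states, w_out):
    return states @ w_out[:-1] + w_out[-1]

# Toy filter to learn (assumption): output the previous input value.
u = rng.uniform(-1.0, 1.0, 2000)
y = np.concatenate([[0.0], u[:-1]])

W, c = make_reservoir(n_neurons=50, norm_bound=0.9, input_scale=0.2)
states = run_reservoir(W, c, u)

washout, split = 100, 1500  # discard the initial transient, hold out the tail
w_out = fit_readout(states[washout:split], y[washout:split], lam=1e-6)
pred = apply_readout(states[split:], w_out)
nmse = np.mean((pred - y[split:]) ** 2) / np.var(y[split:])
```

Approximating a different filter with the same reservoir only requires refitting `w_out` against a different target series.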
NE Dec 3, 2017
Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems
Lyudmila Grigoryeva, Juan-Pablo Ortega
A new class of non-homogeneous state-affine systems is introduced for use in reservoir computing. Sufficient conditions are identified that guarantee, first, that the associated reservoir computers with linear readouts are causal, time-invariant, and satisfy the fading memory property and, second, that a subset of this class is universal in the category of fading memory filters with stochastic almost surely uniformly bounded inputs. This means that any discrete-time filter that satisfies the fading memory property with random inputs of that type can be uniformly approximated by elements in the non-homogeneous state-affine family.
ML May 29, 2016
Singular ridge regression with homoscedastic residuals: generalization error with estimated parameters
Lyudmila Grigoryeva, Juan-Pablo Ortega
This paper characterizes the conditional distribution properties of the finite sample ridge regression estimator and uses that result to evaluate total regression and generalization errors that incorporate the inaccuracies committed at the time of parameter estimation. The paper provides explicit formulas for those errors. Unlike other classical references in this setup, our results take place in a fully singular setup that does not assume the existence of a solution for the non-regularized regression problem. In exchange, we invoke a conditional homoscedasticity hypothesis on the regularized regression residuals that is crucial in our developments.
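One simple instance of a singular regression problem is the case with more covariates than observations, where the unregularized normal equations have no unique solution but the ridge estimator is still well defined. The sketch below assumes that instance, synthetic Gaussian data, and an arbitrary penalty value; it is not claimed to cover the paper's general setup or its error formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_features = 3, 5  # fewer observations than covariates (assumption)
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)
lam = 0.1  # ridge penalty, chosen arbitrarily for the illustration

# X^T X is 5 x 5 but has rank at most 3, so ordinary least squares is not unique.
gram_rank = int(np.linalg.matrix_rank(X.T @ X))

# Ridge estimator in primal form: (X^T X + lam I)^{-1} X^T y.
beta_primal = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Equivalent dual form: X^T (X X^T + lam I)^{-1} y, which solves a 3 x 3 system.
beta_dual = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n_samples), y)
```

Both forms give the same coefficient vector; the dual form is the cheaper one whenever the number of covariates exceeds the number of observations.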
NE Oct 13, 2015
Nonlinear memory capacity of parallel time-delay reservoir computers in the processing of multidimensional signals
Lyudmila Grigoryeva, Julie Henriques, Laurent Larger et al.
This paper addresses the reservoir design problem in the context of delay-based reservoir computers for multidimensional input signals, parallel architectures, and real-time multitasking. First, an approximating reservoir model is presented in those frameworks that provides an explicit functional link between the reservoir parameters and architecture and its performance in the execution of a specific task. Second, the inference properties of the ridge regression estimator in the multivariate context are used to assess the impact of finite sample training on the decrease of the reservoir capacity. Finally, an empirical study is conducted that shows the adequacy of the theoretical results with the empirical performances exhibited by various reservoir architectures in the execution of several nonlinear tasks with multidimensional inputs. Our results confirm the robustness properties of the parallel reservoir architecture with respect to task misspecification and parameter choice that had already been documented in the literature.
ET Aug 1, 2015
Quantitative evaluation of the performance of discrete-time reservoir computers in the forecasting, filtering, and reconstruction of stochastic stationary signals
Lyudmila Grigoryeva, Julie Henriques, Juan-Pablo Ortega
This paper extends the notion of information processing capacity for non-independent input signals in the context of reservoir computing (RC). The presence of input autocorrelation makes worthwhile the treatment of forecasting and filtering problems for which we explicitly compute this generalized capacity as a function of the reservoir parameter values using a streamlined model. The reservoir model leading to these developments is used to show that, whenever that approximation is valid, this computational paradigm satisfies the so-called separation and fading memory properties that are usually associated with good information processing performances. We show that several standard memory, forecasting, and filtering problems that appear in the parametric stochastic time series context can be readily formulated and tackled via RC which, as we show, significantly outperforms standard techniques in some instances.