Juan-Pablo Ortega

h-index: 17

31 papers

1,057 citations

Novelty: 50%

AI Score: 56

Ranked #21,987 of 201,326 authors (top 11%)

#5,018 in LG (top 12%)

31 Papers

LG | Apr 2, 2023

Infinite-dimensional reservoir computing

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Reservoir computing approximation and generalization bounds are proved for a new concept class of input/output systems that extends the so-called generalized Barron functionals to a dynamic context. This new class is characterized by the readouts with a certain integral representation built on infinite-dimensional state-space systems. It is shown that this class is very rich and possesses useful features and universal approximation properties. The reservoir architectures used for the approximation and estimation of elements in the new class are randomly generated echo state networks with either linear or ReLU activation functions. Their readouts are built using randomly generated neural networks in which only the output layer is trained (extreme learning machines or random feature neural networks). The results in the paper yield a fully implementable recurrent neural network-based learning algorithm with provable convergence guarantees that do not suffer from the curse of dimensionality.

LG | Dec 30, 2022

Reservoir kernels and Volterra series

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter, and it is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. The empirical performance of the Volterra reservoir kernel is showcased and compared to other standard static and sequential kernels in a multidimensional and highly nonlinear learning task for the conditional covariances of financial asset returns.

LG | Feb 21, 2023

The Gaussian kernel on the circle and spaces that admit isometric embeddings of the circle

Nathaël Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega

On Euclidean spaces, the Gaussian kernel is one of the most widely used kernels in applications. It has also been used on non-Euclidean spaces, where it is known that there may be (and often are) scale parameters for which it is not positive definite. Hope remains that this kernel is positive definite for many choices of parameter. However, we show that the Gaussian kernel is not positive definite on the circle for any choice of parameter. This implies that on metric spaces in which the circle can be isometrically embedded, such as spheres, projective spaces and Grassmannians, the Gaussian kernel is not positive definite for any parameter.
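
The failure of positive definiteness can be observed numerically for a single parameter value. The sketch below is an illustration rather than the paper's argument: it assumes the parameterization exp(-lam * d^2) with d the geodesic (arc-length) distance, and the point count, the value lam = 0.01, and the function names are choices made here. It builds the Gram matrix on equally spaced points of the unit circle and inspects its smallest eigenvalue.

```python
import numpy as np


def gaussian_gram_on_circle(n_points: int, lam: float) -> np.ndarray:
    """Gram matrix of exp(-lam * d^2) on n equally spaced points of the unit circle.

    d is the geodesic (arc-length) distance; `lam` is an assumed parameterization.
    """
    angles = 2.0 * np.pi * np.arange(n_points) / n_points
    diff = np.abs(angles[:, None] - angles[None, :])
    geodesic = np.minimum(diff, 2.0 * np.pi - diff)  # shorter of the two arcs
    return np.exp(-lam * geodesic**2)


def min_eigenvalue(gram: np.ndarray) -> float:
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(gram).min())


gram = gaussian_gram_on_circle(n_points=64, lam=0.01)
lowest = min_eigenvalue(gram)
# A strictly negative value means this Gram matrix is not positive semidefinite.
print(f"smallest eigenvalue: {lowest:.4f}")
```

A negative smallest eigenvalue on one sample only certifies the failure for that parameter; for rapidly decaying kernels the negative eigenvalues can be small enough to be masked by floating-point error, which is why the general statement needs a proof.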

87.3 | DS | May 24

Data-Specific Hyper-Parameter Design: A Paradigm Shift in Reservoir Computing

G Manjunath, Juan-Pablo Ortega, Alma van der Merwe

Reservoir computing typically relies on large, randomly generated reservoirs, enabling simple, often linear readouts. Over the past two decades, most constructions have exploited the freedom to select the reservoir, constrained primarily by stability conditions based on state contraction or memory capacity. However, these designs are largely independent of the input data and learning objective, resulting in a trial-and-error methodology driven by randomness. In high dimensions, the reservoir acts as a random embedding of the input history, implicitly relying on Johnson-Lindenstrauss-type concentration phenomena to preserve information. In contrast, we develop reservoir design principles from a geometric perspective for inputs generated by deterministic dynamical systems. Rather than relying on random embeddings, we require reservoir state increments to align within a cone around an input-determined vector subspace, and prove that such a cone concentration reduces ridge-regression training error. When the cone angle is small, the variance of reservoir states concentrates in the input-determined subspace, improving conditioning of the empirical second-moment matrix and strengthening alignment between dominant covariance directions and the state-target cross-covariance. For echo state networks, we provide a constructive approach to reservoir design. The reservoir matrix is chosen so that associated Krylov-chain directions remain nearly closed within an input-determined subspace while permitting controlled mixing in its orthogonal complement. We also provide a spectral diagnostic for ridge regression training that identifies when reservoir geometry concentrates predictive information into a few dominant covariance modes and when "spectral pollution" inhibits forecasting. Numerical experiments demonstrate consistent performance gains over arbitrary reservoir constructions.

LG | Oct 30, 2023

Invariant kernels on Riemannian symmetric spaces: a harmonic-analytic approach

Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega et al.

This work aims to prove that the classical Gaussian kernel, when defined on a non-Euclidean symmetric space, is never positive-definite for any choice of parameter. To achieve this goal, the paper develops new geometric and analytical arguments. These provide a rigorous characterization of the positive-definiteness of the Gaussian kernel, which is complete but for a limited number of scenarios in low dimensions that are treated by numerical computations. Chief among these results are the $L^p$-Godement theorems (where $p = 1, 2$), which provide verifiable necessary and sufficient conditions for a kernel defined on a symmetric space of non-compact type to be positive-definite. A celebrated theorem, sometimes called the Bochner-Godement theorem, already gives such conditions and is far more general in its scope, but is especially hard to apply. Beyond the connection with the Gaussian kernel, the new results in this work lay out a blueprint for the study of invariant kernels on symmetric spaces, bringing forth specific harmonic analysis tools that suggest many future applications.

LG | Oct 20, 2023

Geometric Learning with Positively Decomposable Kernels

Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega et al.

Kernel methods are powerful tools in machine learning. Classical kernel methods are based on positive-definite kernels, which map data spaces into reproducing kernel Hilbert spaces (RKHS). For non-Euclidean data spaces, positive-definite kernels are difficult to come by. In this case, we propose the use of reproducing kernel Krein space (RKKS) based methods, which require only kernels that admit a positive decomposition. We show that one does not need to access this decomposition in order to learn in RKKS. We then investigate the conditions under which a kernel is positively decomposable. We show that invariant kernels admit a positive decomposition on homogeneous spaces under tractable regularity assumptions. This makes them much easier to construct than positive-definite kernels, providing a route for learning with kernels for non-Euclidean data. By the same token, this provides theoretical foundations for RKKS-based methods in general.

OC | Aug 14, 2024

Fading memory and the convolution theorem

Juan-Pablo Ortega, Florian Rossmannek

Several topological and analytical notions of continuity and fading memory for causal and time-invariant filters are introduced, and the relations between them are analyzed. A significant generalization of the convolution theorem that establishes the equivalence between the fading memory property and the availability of convolution representations of linear filters is proved. This result extends a previous similar characterization to a complete array of weighted norms in the definition of the fading memory property. Additionally, the main theorem shows that the availability of convolution representations can be characterized, at least when the codomain is finite-dimensional, not only by the fading memory property but also by the reunion of two purely topological notions that are called minimal continuity and minimal fading memory property. Finally, when the input space and the codomain of a linear functional are Hilbert spaces, it is shown that minimal continuity and the minimal fading memory property guarantee the existence of interesting embeddings of the associated reproducing kernel Hilbert spaces.

65.3 | NA | May 10

Kernel Learning of PDE Solution Operators

Jianyu Hu, Juan-Pablo Ortega

A kernel-based approach for the learning of the solution operator of general nonhomogeneous partial differential equations (PDEs) is proposed. The method incorporates physical priors, typically encoded through the PDE operator, into a kernel ridge regression framework, and employs a regularization-based formulation to construct an operator learner. This yields a closed-form estimator that is independent of the input functions that determine the underlying PDE. From the perspective of regularization theory, the resulting estimator induces a well-defined operator that links input and output spaces, which contain the functions that define a Dirichlet problem and its solution, respectively. Consequently, it effectively shifts from a PDE solver to an operator-based solver. In contrast to standard supervised learning methods, it does not rely on paired input-output training data and enables systematic extrapolation beyond observed regimes. A full error analysis is conducted, providing convergence rates for the operator-based solver under suitable choices of regularization parameters. Extensive numerical experiments, including Darcy flow and Helmholtz equations, demonstrate that the proposed method achieves high accuracy and efficiency across a range of problem settings, and compares favorably with operator learning approaches in both approximation quality and computational cost.

ML | Mar 15, 2024

A Structure-Preserving Kernel Method for Learning Hamiltonian Systems

Jianyu Hu, Juan-Pablo Ortega, Daiying Yin

A structure-preserving kernel ridge regression method is presented that allows the recovery of nonlinear Hamiltonian functions out of datasets made of noisy observations of Hamiltonian vector fields. The method proposes a closed-form solution that yields excellent numerical performances that surpass other techniques proposed in the literature in this setup. From the methodological point of view, the paper extends kernel regression methods to problems in which loss functions involving linear functions of gradients are required and, in particular, a differential reproducing property and a Representer Theorem are proved in this context. The relation between the structure-preserving kernel estimator and the Gaussian posterior mean estimator is analyzed. A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters. The good performance of the proposed estimator together with the convergence rate is illustrated with various numerical experiments.

LG | Dec 13, 2024

Infinite-dimensional next-generation reservoir computing

Lyudmila Grigoryeva, Hannah Lim Jing Ting, Juan-Pablo Ortega

Next-generation reservoir computing (NG-RC) has attracted much attention due to its excellent performance in spatio-temporal forecasting of complex systems and its ease of implementation. This paper shows that NG-RC can be encoded as a kernel ridge regression that makes training efficient and feasible even when the space of chosen polynomial features is very large. Additionally, an extension to an infinite number of covariates is possible, which makes the methodology agnostic with respect to the lags into the past that are considered as explanatory factors, as well as with respect to the number of polynomial covariates, an important hyperparameter in traditional NG-RC. We show that this approach has solid theoretical backing and good behavior based on kernel universality properties previously established in the literature. Various numerical illustrations show that these generalizations of NG-RC outperform the traditional approach in several forecasting applications.
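
The kernel viewpoint can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the paper's construction: it substitutes a plain polynomial kernel (1 + <u, v>)^degree on delay vectors, whose feature space contains all monomials of the lagged inputs up to the given degree, and the helper names, the toy target, and the ridge value are hypothetical choices. Training solves an n-by-n linear system in the Gram matrix, so the monomial features are never formed explicitly.

```python
import numpy as np


def delay_embed(series: np.ndarray, lags: int) -> np.ndarray:
    """Row i holds (z_{t-1}, ..., z_{t-lags}) for t = lags + i."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def poly_kernel(a: np.ndarray, b: np.ndarray, degree: int) -> np.ndarray:
    """Polynomial kernel; an assumed stand-in for the kernels used in the paper."""
    return (1.0 + a @ b.T) ** degree


def fit_kernel_ridge(features, targets, degree, ridge):
    """Dual coefficients alpha solving (K + ridge * I) alpha = y."""
    gram = poly_kernel(features, features, degree)
    return np.linalg.solve(gram + ridge * np.eye(len(features)), targets)


def predict(train_features, alpha, new_features, degree):
    """Kernel ridge prediction at new delay vectors."""
    return poly_kernel(new_features, train_features, degree) @ alpha


rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=400)
lags, degree = 2, 2
X = delay_embed(z, lags)
y = X[:, 0] * X[:, 1]  # toy target z_{t-1} * z_{t-2}, a degree-2 monomial in the lags

alpha = fit_kernel_ridge(X[:300], y[:300], degree, ridge=1e-6)
test_error = np.max(np.abs(predict(X[:300], alpha, X[300:], degree) - y[300:]))
print(f"max test error: {test_error:.2e}")
```

Because the toy target lies in the span of the degree-2 monomials, the fit is essentially exact here; the cost of training depends on the number of samples and not on the number of polynomial covariates.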

ML | Apr 12, 2024

State-space systems as dynamic generative models

Juan-Pablo Ortega, Florian Rossmannek

A probabilistic framework to study the dependence structure induced by deterministic discrete-time state-space systems between input and output processes is introduced. General sufficient conditions are formulated under which output processes exist and are unique once an input process has been fixed, a property that in the deterministic state-space literature is known as the echo state property. When those conditions are satisfied, the given state-space system becomes a generative model for probabilistic dependences between two sequence spaces. Moreover, those conditions guarantee that the output depends continuously on the input when using the Wasserstein metric. The output processes whose existence is proved are shown to be causal in a specific sense and to generalize those studied in purely deterministic situations. The results in this paper constitute a significant stochastic generalization of sufficient conditions for the deterministic echo state property to hold, in the sense that the stochastic echo state property can be satisfied under contractivity conditions that are strictly weaker than those in deterministic situations. This means that state-space systems can induce a purely probabilistic dependence structure between input and output sequence spaces even when there is no functional relation between those two spaces.

QUANT-PH | Jun 19, 2025

Feedback-driven recurrent quantum neural network universality

Lukas Gonon, Rodrigo Martínez-Peña, Juan-Pablo Ortega

Quantum reservoir computing uses the dynamics of quantum systems to process temporal data, making it particularly well-suited for learning with noisy intermediate-scale quantum devices. Early experimental proposals, such as the restarting and rewinding protocols, relied on repeating previous steps of the quantum map to avoid backaction. However, this approach compromises real-time processing and increases computational overhead. Recent developments have introduced alternative protocols that address these limitations. These include online, mid-circuit measurement, and feedback techniques, which enable real-time computation while preserving the input history. Among these, the feedback protocol stands out for its ability to process temporal information with comparatively fewer components. Despite this potential advantage, the theoretical foundations of feedback-based quantum reservoir computing remain underdeveloped, particularly with regard to the universality and the approximation capabilities of this approach. This paper addresses this issue by presenting a recurrent quantum neural network architecture that extends a class of existing feedforward models to a dynamic, feedback-driven reservoir setting. We provide theoretical guarantees for variational recurrent quantum neural networks, including approximation bounds and universality results. Notably, our analysis demonstrates that the model is universal with linear readouts, making it both powerful and experimentally accessible. These results pave the way for practical and theoretically grounded quantum reservoir computing with real-time processing capabilities.

ML | Aug 26, 2025

Echoes of the past: A unified perspective on fading memory and echo states

Juan-Pablo Ortega, Florian Rossmannek

Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network handles its memory of the information it processed. Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory. Although these notions are often used interchangeably, their precise relationships remain unclear. This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results. By clarifying the relationships between these concepts, this research contributes to a deeper understanding of RNNs and their temporal information processing capabilities.

ML | Aug 11, 2025

Stochastic dynamics learning with state-space systems

Juan-Pablo Ortega, Florian Rossmannek

This work advances the theoretical foundations of reservoir computing (RC) by providing a unified treatment of fading memory and the echo state property (ESP) in both deterministic and stochastic settings. We investigate state-space systems, a central model class in time series learning, and establish that fading memory and solution stability hold generically, even in the absence of the ESP, offering a robust explanation for the empirical success of RC models without strict contractivity conditions. In the stochastic case, we critically assess stochastic echo states, proposing a novel distributional perspective rooted in attractor dynamics on the space of probability distributions, which leads to a rich and coherent theory. Our results extend and generalize previous work on non-autonomous dynamical systems, offering new insights into causality, stability, and memory in RC models. This lays the groundwork for reliable generative modeling of temporal data in both deterministic and stochastic regimes.

LG | Feb 7, 2025

Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?

Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega

The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact questions the usefulness of this metric in distinguishing the performance of linear RNNs in the processing of stochastic signals. This work shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending exclusively on the scale of the input process. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.

ML | Sep 1, 2023

Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives

Marcel Hirt, Domenico Campolo, Victoria Leong et al.

Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations that jointly explain multiple modalities. Various objective functions for such models have been suggested, often motivated as lower bounds on the multi-modal data log-likelihood or from information-theoretic considerations. To encode latent variables from different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts (MoE) aggregation schemes have been routinely used and shown to yield different trade-offs, for instance, regarding their generative quality or consistency across multiple modalities. In this work, we consider a variational objective that can tightly approximate the data log-likelihood. We develop more flexible aggregation schemes that avoid the inductive biases in PoE or MoE approaches by combining encoded features from different modalities based on permutation-invariant neural networks. Our numerical experiments illustrate trade-offs for multi-modal variational objectives and various aggregation schemes. We show that our variational objective and more flexible aggregation models can become beneficial when one wants to approximate the true joint distribution over observed modalities and latent variables in identifiable models.

LG | May 2, 2023

Memory of recurrent networks: Do we compute it right?

Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega

Numerical evaluations of the memory capacity (MC) of recurrent neural networks reported in the literature often contradict well-established theoretical bounds. In this paper, we study the case of linear echo state networks, for which the total memory capacity has been proven to be equal to the rank of the corresponding Kalman controllability matrix. We shed light on various reasons for the inaccurate numerical estimations of the memory, and we show that these issues, often overlooked in the recent literature, are of an exclusively numerical nature. More explicitly, we prove that when the Krylov structure of the linear MC is ignored, a gap between the theoretical MC and its empirical counterpart is introduced. As a solution, we develop robust numerical approaches by exploiting a result of MC neutrality with respect to the input mask matrix. Simulations show that the memory curves that are recovered using the proposed methods fully agree with the theory.

DS | Aug 11, 2021

Learning strange attractors with reservoir systems

Lyudmila Grigoryeva, Allen Hart, Juan-Pablo Ortega

This paper shows that the celebrated Embedding Theorem of Takens is a particular case of a much more general statement according to which, randomly generated linear state-space representations of generic observations of an invertible dynamical system carry in their wake an embedding of the phase space dynamics into the chosen Euclidean state space. This embedding coincides with a natural generalized synchronization that arises in this setup and that yields a topological conjugacy between the state-space dynamics driven by the generic observations of the dynamical system and the dynamical system itself. This result provides additional tools for the representation, learning, and analysis of chaotic attractors and sheds additional light on the reservoir computing phenomenon that appears in the context of recurrent neural networks.

NE | Oct 22, 2020

Fading memory echo state networks are universal

Lukas Gonon, Juan-Pablo Ortega

Echo state networks (ESNs) have been recently proved to be universal approximants for input/output systems with respect to various $L^p$-type criteria. When $1\leq p< \infty$, only $p$-integrability hypotheses need to be imposed, while in the case $p=\infty$ a uniform boundedness hypothesis on the inputs is required. This note shows that, in the last case, a universal family of ESNs can be constructed that contains exclusively elements that have the echo state and the fading memory properties. This conclusion could not be drawn with the results and methods available so far in the literature.

NE | Sep 17, 2020

Discrete-time signatures and randomness in reservoir computing

Christa Cuchiero, Lukas Gonon, Lyudmila Grigoryeva et al.

A new explanation, geometric in nature, of the reservoir computing phenomenon is presented. Reservoir computing is understood in the literature as the possibility of approximating input/output systems with randomly chosen recurrent neural systems and a trained linear readout layer. Light is shed on this phenomenon by constructing what is called strongly universal reservoir systems as random projections of a family of state-space systems that generate Volterra series expansions. This procedure yields a state-affine reservoir system with randomly generated coefficients in a dimension that is logarithmically reduced with respect to the original system. This reservoir system is able to approximate any element in the fading memory filters class just by training a different linear readout for each different filter. Explicit expressions for the probability distributions needed in the generation of the projected reservoir system are stated and bounds for the committed approximation error are provided.

OC | Jul 23, 2020

Dimension reduction in recurrent networks by canonicalization

Lyudmila Grigoryeva, Juan-Pablo Ortega

Many recurrent neural network machine learning paradigms can be formulated using state-space representations. The classical notion of canonical state-space realization is adapted in this paper to accommodate semi-infinite inputs so that it can be used as a dimension reduction tool in the recurrent networks setup. The so-called input forgetting property is identified as the key hypothesis that guarantees the existence and uniqueness (up to system isomorphisms) of canonical realizations for causal and time-invariant input/output systems with semi-infinite inputs. Additionally, the notion of optimal reduction coming from the theory of symmetric Hamiltonian systems is implemented in our setup to construct canonical realizations out of input forgetting but not necessarily canonical ones. These two procedures are studied in detail in the framework of linear fading memory input/output systems. Finally, the notion of implicit reduction using reproducing kernel Hilbert spaces (RKHS) is introduced, which allows systems with linear readouts to achieve dimension reduction without the need to actually compute the reduced spaces introduced in the first part of the paper.

OC | Apr 22, 2020

Memory and forecasting capacities of nonlinear recurrent networks

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

The notion of memory capacity, originally introduced for echo state and linear networks with independent inputs, is generalized to nonlinear recurrent networks with stationary but dependent inputs. The presence of dependence in the inputs makes natural the introduction of the network forecasting capacity, that measures the possibility of forecasting time series values using network states. Generic bounds for memory and forecasting capacities are formulated in terms of the number of neurons of the nonlinear recurrent network and the autocovariance function or the spectral density of the input. These bounds generalize well-known estimates in the literature to a dependent inputs setup. Finally, for the particular case of linear recurrent networks with independent inputs it is proved that the memory capacity is given by the rank of the associated controllability matrix, a fact that has been for a long time assumed to be true without proof by the community.
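
The rank characterization in the linear case can be illustrated with a short computation. The sketch below is hedged: it assumes a linear state equation of the form x_t = A x_{t-1} + C z_t with scalar inputs, and the function names, state dimension, and random seed are choices made here. It forms the Kalman controllability matrix [C, AC, ..., A^{N-1} C] and evaluates its numerical rank for a random connectivity matrix and for a degenerate one.

```python
import numpy as np


def controllability_matrix(A: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Kalman controllability matrix [C, AC, ..., A^{N-1} C] for an N-dimensional state."""
    n_states = A.shape[0]
    blocks = [C]
    for _ in range(n_states - 1):
        blocks.append(A @ blocks[-1])  # next Krylov direction
    return np.hstack(blocks)


def controllability_rank(A: np.ndarray, C: np.ndarray) -> int:
    """Numerical rank of the controllability matrix."""
    return int(np.linalg.matrix_rank(controllability_matrix(A, C)))


rng = np.random.default_rng(1)
n_states = 5
A_random = rng.normal(size=(n_states, n_states)) / np.sqrt(n_states)
C = rng.normal(size=(n_states, 1))

rank_random = controllability_rank(A_random, C)
# With A a multiple of the identity every Krylov direction is parallel to C.
rank_degenerate = controllability_rank(0.5 * np.eye(n_states), C)
print(rank_random, rank_degenerate)
```

The state dimension is kept small on purpose: Krylov matrices become badly conditioned as the dimension grows, so a naive numerical rank can underestimate the theoretical value for larger networks.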

PR | Feb 14, 2020

Approximation Bounds for Random Neural Networks and Reservoir Systems

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

This work studies approximation based on single-hidden-layer feedforward and recurrent neural networks with randomly generated internal weights. These methods, in which only the last layer of weights and a few hyperparameters are optimized, have been successfully applied in a wide range of static and dynamic learning problems. Despite the popularity of this approach in empirical tasks, important theoretical questions regarding the relation between the unknown function, the weight distribution, and the approximation rate have remained open. In this work it is proved that, as long as the unknown function, functional, or dynamical system is sufficiently regular, it is possible to draw the internal weights of the random (recurrent) neural network from a generic distribution (not depending on the unknown object) and quantify the error in terms of the number of neurons and the hyperparameters. In particular, this proves that echo state networks with randomly generated weights are capable of approximating a wide class of dynamical systems arbitrarily well and thus provides the first mathematical explanation for their empirically observed success at learning dynamical systems.

LG | Oct 30, 2019

Risk bounds for reservoir computing

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This makes it possible, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals.

NE | Feb 16, 2019

Differentiable reservoir computing

Lyudmila Grigoryeva, Juan-Pablo Ortega

Much effort has been devoted in the last two decades to characterize the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor's theorem and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory.

ET | Jul 7, 2018

Reservoir Computing Universality With Stochastic Inputs

Lukas Gonon, Juan-Pablo Ortega

The universal approximation properties with respect to $L^p$-type criteria of three important families of reservoir computers with stochastic discrete-time semi-infinite inputs are shown. First, it is proved that linear reservoir systems with either polynomial or neural network readout maps are universal. More importantly, it is proved that the same property holds for two families with linear readouts, namely, trigonometric state-affine systems and echo state networks, which are the most widely used reservoir systems in applications. The linearity in the readouts is a key feature in supervised machine learning applications. It guarantees that these systems can be used in high-dimensional situations and in the presence of large datasets. The $L^p$ criteria used in this paper allow the formulation of universality results that do not necessarily impose almost sure uniform boundedness in the inputs or the fading memory property in the filter that needs to be approximated.

NE | Jun 3, 2018

Echo state networks are universal

Lyudmila Grigoryeva, Juan-Pablo Ortega

This paper shows that echo state networks are universal uniform approximants in the context of discrete-time fading memory filters with uniformly bounded inputs defined on negative infinite times. This result guarantees that any fading memory input/output system in discrete time can be realized as a simple finite-dimensional neural network-type state-space model with a static linear readout map. This approximation is valid for infinite time intervals. The proof of this statement is based on fundamental results, also presented in this work, about the topological nature of the fading memory property and about reservoir computing systems generated by continuous reservoir maps.
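
The kind of model this abstract refers to can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the construction used in the proof: it takes the state map x_t = tanh(A x_{t-1} + C z_t) with a randomly drawn A rescaled to spectral norm 0.9, a linear readout fitted by ridge regression, and hypothetical function names, sizes, and a toy one-step recall task. Because tanh is 1-Lipschitz and the spectral norm of A is below one, the state map is a contraction, so trajectories started from different initial states driven by the same input converge to each other.

```python
import numpy as np


def make_reservoir(n_neurons: int, spectral_norm: float, rng):
    """Random reservoir matrix rescaled so its largest singular value is `spectral_norm`."""
    A = rng.normal(size=(n_neurons, n_neurons))
    A *= spectral_norm / np.linalg.norm(A, 2)
    C = rng.uniform(-1.0, 1.0, size=n_neurons)  # input mask for scalar inputs
    return A, C


def run_reservoir(A, C, inputs, x0):
    """Iterate x_t = tanh(A x_{t-1} + C z_t) and return all states, one per row."""
    states = np.empty((len(inputs), len(x0)))
    x = x0
    for t, z in enumerate(inputs):
        x = np.tanh(A @ x + C * z)
        states[t] = x
    return states


def fit_ridge_readout(states, targets, ridge):
    """Linear readout W minimizing ||states @ W - targets||^2 + ridge * ||W||^2."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + ridge * np.eye(n), states.T @ targets)


rng = np.random.default_rng(2)
A, C = make_reservoir(n_neurons=30, spectral_norm=0.9, rng=rng)
inputs = rng.uniform(-1.0, 1.0, size=300)

states_a = run_reservoir(A, C, inputs, np.zeros(30))
states_b = run_reservoir(A, C, inputs, rng.uniform(-1.0, 1.0, size=30))
final_gap = np.linalg.norm(states_a[-1] - states_b[-1])  # shrinks at least like 0.9**t

# Toy task: recover the previous input z_{t-1} from the current state x_t.
W = fit_ridge_readout(states_a[1:], inputs[:-1], ridge=1e-6)
print(f"gap between trajectories after 300 steps: {final_gap:.2e}")
```

Only the readout W is trained; A and C stay fixed after being drawn, which is what makes the training a linear regression problem.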

NEDec 3, 2017

Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems

Lyudmila Grigoryeva, Juan-Pablo Ortega

A new class of non-homogeneous state-affine systems is introduced for use in reservoir computing. Sufficient conditions are identified that guarantee, first, that the associated reservoir computers with linear readouts are causal, time-invariant, and satisfy the fading memory property and, second, that a subset of this class is universal in the category of fading memory filters with stochastic, almost surely uniformly bounded inputs. This means that any discrete-time filter that satisfies the fading memory property with random inputs of that type can be uniformly approximated by elements in the non-homogeneous state-affine family.
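As a rough illustration of a state-affine recursion, the sketch below assumes the scalar-input form $x_t = p(z_t)\,x_{t-1} + q(z_t)$, where $p$ is a matrix-valued and $q$ a vector-valued polynomial in the input. The function name `sas_states`, the coefficient lists `P` and `Q`, and the norm-based scaling are assumptions made for this example and are not the paper's exact conditions.

```python
import numpy as np

def sas_states(P, Q, inputs, x0=None):
    """Iterate x_t = p(z_t) x_{t-1} + q(z_t) for scalar inputs z_t.

    P: list of (N, N) arrays, P[k] is the coefficient of z**k in p(z).
    Q: list of (N,) arrays, Q[k] is the coefficient of z**k in q(z).
    """
    n = Q[0].shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    states = []
    for z in inputs:
        p = sum(Pk * z**k for k, Pk in enumerate(P))
        q = sum(Qk * z**k for k, Qk in enumerate(Q))
        x = p @ x + q  # affine in the previous state
        states.append(x)
    return np.array(states)

# Example setup (all values hypothetical). For inputs in [-1, 1], the norm of
# p(z) is bounded by the sum of the coefficient norms, so scaling that sum
# below one makes the recursion a contraction in the state.
rng = np.random.default_rng(0)
P = [rng.standard_normal((5, 5)) for _ in range(3)]
total = sum(np.linalg.norm(Pk, 2) for Pk in P)
P = [0.8 * Pk / total for Pk in P]
Q = [rng.standard_normal(5) for _ in range(2)]
z = rng.uniform(-1, 1, size=150)
states = sas_states(P, Q, z)
```

A linear readout $y_t = W^\top x_t$ would then be fitted on `states`, for instance by least squares.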

MLMay 29, 2016

Singular ridge regression with homoscedastic residuals: generalization error with estimated parameters

Lyudmila Grigoryeva, Juan-Pablo Ortega

This paper characterizes the conditional distribution properties of the finite sample ridge regression estimator and uses that result to evaluate total regression and generalization errors that incorporate the inaccuracies committed at the time of parameter estimation. The paper provides explicit formulas for those errors. Unlike other classical references in this setup, our results take place in a fully singular setup that does not assume the existence of a solution for the non-regularized regression problem. In exchange, we invoke a conditional homoscedasticity hypothesis on the regularized regression residuals that is crucial in our developments.
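To make the singular setting concrete, the sketch below computes a ridge estimator when there are more covariates than observations, so that $X^\top X$ is not invertible and the non-regularized normal equations have no unique solution. It illustrates only the estimator itself, not the error formulas derived in the paper; the function names and the simulated data are assumptions made here.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Primal ridge estimator: solve (X^T X + lam I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def ridge_fit_dual(X, y, lam):
    """Equivalent dual form: beta = X^T (X X^T + lam I)^{-1} y."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

# Simulated data (hypothetical): 10 samples and 30 features, so X^T X has
# rank at most 10 and is singular.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 30))
y = rng.standard_normal(10)
beta = ridge_fit(X, y, lam=0.1)
```

For any `lam > 0` the regularized matrix is positive definite, which is why the estimator remains well defined even though `X.T @ X` alone cannot be inverted.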

NEOct 13, 2015

Nonlinear memory capacity of parallel time-delay reservoir computers in the processing of multidimensional signals

Lyudmila Grigoryeva, Julie Henriques, Laurent Larger et al.

This paper addresses the reservoir design problem in the context of delay-based reservoir computers for multidimensional input signals, parallel architectures, and real-time multitasking. First, an approximating reservoir model is presented in those frameworks that provides an explicit functional link between the reservoir parameters and architecture and its performance in the execution of a specific task. Second, the inference properties of the ridge regression estimator in the multivariate context are used to assess the impact of finite sample training on the decrease of the reservoir capacity. Finally, an empirical study is conducted that shows the agreement between the theoretical results and the empirical performance exhibited by various reservoir architectures in the execution of several nonlinear tasks with multidimensional inputs. Our results confirm the robustness properties of the parallel reservoir architecture with respect to task misspecification and parameter choice that had already been documented in the literature.

ETAug 1, 2015

Quantitative evaluation of the performance of discrete-time reservoir computers in the forecasting, filtering, and reconstruction of stochastic stationary signals

Lyudmila Grigoryeva, Julie Henriques, Juan-Pablo Ortega

This paper extends the notion of information processing capacity for non-independent input signals in the context of reservoir computing (RC). The presence of input autocorrelation makes it worthwhile to treat forecasting and filtering problems, for which we explicitly compute this generalized capacity as a function of the reservoir parameter values using a streamlined model. The reservoir model leading to these developments is used to show that, whenever that approximation is valid, this computational paradigm satisfies the so-called separation and fading memory properties that are usually associated with good information processing performance. We show that several standard memory, forecasting, and filtering problems that appear in the parametric stochastic time series context can be readily formulated and tackled via RC which, as we show, significantly outperforms standard techniques in some instances.