LGSYNov 19, 2022

Linear RNNs Provably Learn Linear Dynamic Systems

arXiv:2211.10582v2h-index: 6
Originality Highly original
AI Analysis

This provides foundational theoretical insights for researchers in machine learning and control theory, addressing a long-standing gap in understanding RNN learning dynamics, though it is incremental as it focuses on linear systems.

The paper tackles the problem of proving theoretical guarantees for linear recurrent neural networks (RNNs) to learn stable linear dynamic systems using gradient descent, showing that with sufficient width, they can learn any such system with polynomial sample and time complexity in terms of a stability parameter.

We study the learning ability of linear recurrent neural networks with Gradient Descent. We prove the first theoretical guarantee on linear RNNs to learn any stable linear dynamic system using any a large type of loss functions. For an arbitrary stable linear system with a parameter $ρ_C$ related to the transition matrix $C$, we show that despite the non-convexity of the parameter optimization loss if the width of the RNN is large enough (and the required width in hidden layers does not rely on the length of the input sequence), a linear RNN can provably learn any stable linear dynamic system with the sample and time complexity polynomial in $\frac{1}{1-ρ_C}$. Our results provide the first theoretical guarantee to learn a linear RNN and demonstrate how can the recurrent structure help to learn a dynamic system.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes