LGAIOct 11, 2022

On Scrambling Phenomena for Randomly Initialized Recurrent Networks

arXiv:2210.05212v13 citationsh-index: 12
Originality Highly original
AI Analysis

This addresses the training instability of RNNs by formally linking them to chaotic systems, offering insights for practitioners in deep learning, though it is incremental as it builds on prior gradient analysis.

The paper proves that randomly initialized recurrent neural networks (RNNs) exhibit Li-Yorke chaos with constant probability independent of width, explaining scrambling phenomena where nearby trajectories diverge unpredictably, and shows this chaotic behavior persists under small perturbations with exponential expressive power.

Recurrent Neural Networks (RNNs) frequently exhibit complicated dynamics, and their sensitivity to the initialization process often renders them notoriously hard to train. Recent works have shed light on such phenomena analyzing when exploding or vanishing gradients may occur, either of which is detrimental for training dynamics. In this paper, we point to a formal connection between RNNs and chaotic dynamical systems and prove a qualitatively stronger phenomenon about RNNs than what exploding gradients seem to suggest. Our main result proves that under standard initialization (e.g., He, Xavier etc.), RNNs will exhibit \textit{Li-Yorke chaos} with \textit{constant} probability \textit{independent} of the network's width. This explains the experimentally observed phenomenon of \textit{scrambling}, under which trajectories of nearby points may appear to be arbitrarily close during some timesteps, yet will be far away in future timesteps. In stark contrast to their feedforward counterparts, we show that chaotic behavior in RNNs is preserved under small perturbations and that their expressive power remains exponential in the number of feedback iterations. Our technical arguments rely on viewing RNNs as random walks under non-linear activations, and studying the existence of certain types of higher-order fixed points called \textit{periodic points} that lead to phase transitions from order to chaos.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes