A Random-Matrix Criterion for Initializing Gated Recurrent Neural Networks

Tommaso Fioratti, Riccardo Marcaccioli, Francesco Casola

arXiv:2605.10650

Predicted impact: top 95% in LG · last 90 days

Originality: Incremental advance

AI Analysis

Provides a practical initialization guideline for practitioners using gated RNNs in reservoir computing, though the result is incremental as it extends known theory.

The authors derive a simple criterion to estimate the critical weight variance for initializing gated recurrent neural networks, showing it closely tracks the gain at which a reservoir achieves peak performance on a chaotic forecasting task.

Proper weight initialization before training has historically been one of the key factors that helped kick off the deep learning revolution. Initialization is even more crucial in "reservoir computing", where the weights of a readout layer are learned linearly while the reservoir weights stay fixed and largely determine the richness, stability and memory of the resulting dynamics. In the infinite-width limit, it has been shown that meaningful initializations are those sitting at an effective critical point of the randomly initialized model. The phase transition is controlled by the weight variance $g^2$ and separates an ordered phase from a chaotic one in which information progressively degrades. Here we derive a simple criterion to estimate the critical gain $g_c$ for a broad class of recurrent architectures, and we show that it closely tracks the gain at which a gated-RNN reservoir achieves peak performance on a chaotic forecasting task. Finally, we argue that our criterion can serve as a design principle for future initialization schemes.
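The abstract does not state the criterion itself, so the sketch below does not attempt to reproduce it. It only illustrates the role of the gain $g$ in the simplest ungated case, which is an assumption made here for illustration: a discrete-time map $h_{t+1} = \tanh(J h_t)$ with i.i.d. Gaussian entries of variance $g^2/N$. By the circular law, the spectral radius of $J$ is close to $g$ for large $N$, and since $\tanh'(0) = 1$ the Jacobian at $h = 0$ is $J$ itself, so the quiescent state loses linear stability near $g = 1$. The function names, the matrix size, and the chosen gains are all hypothetical.

```python
import numpy as np


def random_recurrent_matrix(n, g, rng):
    """Draw an n x n matrix with i.i.d. Gaussian entries of variance g**2 / n."""
    return rng.normal(0.0, g / np.sqrt(n), size=(n, n))


def spectral_radius(matrix):
    """Return the largest eigenvalue modulus of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(matrix))))


def zero_state_is_stable(n, g, seed=0):
    """Check linear stability of h = 0 under h_{t+1} = tanh(J h_t).

    Assumption: plain (ungated) tanh network. Because tanh'(0) = 1, the
    Jacobian at h = 0 equals J, so h = 0 is linearly stable exactly when
    the spectral radius of J is below 1.
    """
    rng = np.random.default_rng(seed)
    J = random_recurrent_matrix(n, g, rng)
    return spectral_radius(J) < 1.0


if __name__ == "__main__":
    n = 400  # illustrative width; finite-size effects shrink as n grows
    for g in (0.5, 0.8, 1.2, 1.5):
        rng = np.random.default_rng(0)
        rho = spectral_radius(random_recurrent_matrix(n, g, rng))
        print(f"g = {g:.1f}  spectral radius = {rho:.3f}  stable = {rho < 1.0}")
```

For gated architectures the Jacobian also depends on the gate activations, so the critical gain generally differs from this ungated value of 1. That is the gap the paper's criterion is meant to address.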
