LG, Jun 10, 2025

On the Stability of the Jacobian Matrix in Deep Neural Networks

arXiv:2506.08764v1, 3 citations, h-index: 12
Originality: Incremental advance
AI Analysis

The paper provides a theoretical foundation for initialization schemes in modern neural networks with structured randomness, addressing gradient stability, a practical concern for deep learning practitioners.

The paper tackles exploding and vanishing gradients in deep neural networks by establishing a general stability theorem for the input-output Jacobian. It extends prior work to cover sparse and non-i.i.d. weights, with rigorous guarantees based on random matrix theory.

Deep neural networks are known to suffer from exploding or vanishing gradients as depth increases, a phenomenon closely tied to the spectral behavior of the input-output Jacobian. Prior work has identified critical initialization schemes that ensure Jacobian stability, but these analyses are typically restricted to fully connected networks with i.i.d. weights. In this work, we go significantly beyond these limitations: we establish a general stability theorem for deep neural networks that accommodates sparsity (such as that introduced by pruning) and non-i.i.d., weakly correlated weights (e.g. induced by training). Our results rely on recent advances in random matrix theory, and provide rigorous guarantees for spectral stability in a much broader class of network models. This extends the theoretical foundation for initialization schemes in modern neural networks with structured and dependent randomness.
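The link between depth, initialization scale, and the Jacobian spectrum can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a bias-free tanh network with Gaussian weights of standard deviation `sigma_w / sqrt(width * density)`, and models sparsity with a random Bernoulli mask. The function name, the `density` parameter, and the rescaling choice are all illustrative assumptions. It accumulates the input-output Jacobian as a product of layer Jacobians `diag(tanh'(h)) @ W` and returns its singular values.

```python
import numpy as np


def jacobian_singular_values(depth, width, sigma_w, density=1.0, seed=0):
    """Singular values of the input-output Jacobian of a random tanh network.

    Hypothetical setup (not taken from the paper): no biases, Gaussian weights
    with std sigma_w / sqrt(width * density), and a Bernoulli(density) mask
    that zeroes out entries to mimic pruning.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)  # random input vector
    J = np.eye(width)               # running product of layer Jacobians

    for _ in range(depth):
        # Sparse random weight matrix; the rescaling keeps the per-row
        # variance equal to sigma_w**2 regardless of the density.
        mask = rng.random((width, width)) < density
        W = rng.standard_normal((width, width)) * mask
        W *= sigma_w / np.sqrt(width * density)

        h = W @ x                   # pre-activations
        x = np.tanh(h)              # post-activations
        D = 1.0 - x**2              # tanh'(h), elementwise

        # Layer Jacobian is diag(D) @ W; chain rule multiplies on the left.
        J = (D[:, None] * W) @ J

    return np.linalg.svd(J, compute_uv=False)


if __name__ == "__main__":
    for sigma_w in (0.5, 1.0, 3.0):
        s = jacobian_singular_values(depth=30, width=200, sigma_w=sigma_w)
        print(f"sigma_w={sigma_w}: mean squared singular value = {np.mean(s**2):.3e}")
```

Under these assumptions, a small `sigma_w` drives the mean squared singular value toward zero with depth (vanishing gradients) and a large one makes it grow (exploding gradients). Lowering `density` below 1 gives a rough way to probe how the same picture behaves for sparse weights.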

