PR | LG | Jul 11, 2024

Genus expansion for non-linear random matrix ensembles with applications to neural networks

arXiv:2407.08459v51 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This paper provides theoretical tools for understanding the behavior of neural networks at initialization, mainly of interest to researchers in machine learning theory. It is rated incremental because it builds on existing methods from random matrix theory and neural network analysis.

The paper analyzes random neural networks at initialization by introducing a series expansion that linearizes the effect of the activation functions, which allows Wick's principle and the genus expansion to be applied directly. It gives a new proof of convergence to Gaussian processes in the infinite-width limit, quantifies the rate at which the Neural Tangent Kernel converges to its deterministic limit in Frobenius norm, and computes the moments of the Jacobian's limiting spectral distribution. These results are then extended to sparse and non-Gaussian weights under moment assumptions.

We present a unified approach to studying certain non-linear random matrix ensembles and associated random neural networks at initialization. This begins with a novel series expansion for neural networks which generalizes Faà di Bruno's formula to an arbitrary number of compositions. The role of monomials is played by random multilinear maps indexed by directed graphs, whose edges correspond to random matrices. Crucially, this expansion linearizes the effect of the activation functions, allowing for the direct application of Wick's principle and the genus expansion technique. As an application, we prove several results about neural networks with random weights. We first give a new proof of the fact that they converge to Gaussian processes as their width tends to infinity. Secondly, we quantify the rate of convergence of the Neural Tangent Kernel to its deterministic limit in Frobenius norm. Finally, we compute the moments of the limiting spectral distribution of the Jacobian (only the first two of which were previously known), expressing them as sums over non-crossing partitions. All of these results are then generalised to the case of neural networks with sparse and non-Gaussian weights, under moment assumptions.
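As background for the terms "Wick's principle" and "genus expansion", the following is a minimal sketch of the classical genus expansion for a single Gaussian Hermitian matrix (the GUE), not the paper's graph-indexed expansion for neural networks. Wick's principle writes the 2k-th trace moment as a sum over pairings of 2k points; each pairing glues the edges of a 2k-gon into a closed surface, and a pairing of genus g contributes with weight N^(-2g), assuming the normalization E|M_ij|^2 = 1/N. The genus-0 pairings are exactly the non-crossing ones, which is why only non-crossing structures survive as N tends to infinity. The function names `pairings`, `genus`, and `genus_counts` are illustrative choices and do not come from the paper.

```python
from collections import Counter


def pairings(points):
    """Yield every perfect matching of `points` as a list of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def genus(pairing, n):
    """Genus of the surface obtained by gluing the n edges of a polygon
    pairwise according to `pairing` (a perfect matching of range(n))."""
    pi = {}
    for a, b in pairing:
        pi[a] = b
        pi[b] = a
    # Vertices of the glued surface correspond to cycles of the
    # permutation i -> pi(i) + 1 (mod n).
    seen = set()
    vertices = 0
    for start in range(n):
        if start in seen:
            continue
        vertices += 1
        i = start
        while i not in seen:
            seen.add(i)
            i = (pi[i] + 1) % n
    # Euler characteristic: V - E + F = 2 - 2g, with E = n/2 and F = 1.
    return (n // 2 + 1 - vertices) // 2


def genus_counts(k):
    """Number of pairings of 2k points, grouped by genus."""
    n = 2 * k
    return Counter(genus(p, n) for p in pairings(list(range(n))))


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, dict(sorted(genus_counts(k).items())))
```

For k = 2 the three pairings split into two of genus 0 (the non-crossing ones) and one of genus 1, so under the stated normalization the normalized fourth moment is 2 + 1/N^2. The genus-0 counts for successive k are the Catalan numbers, which count non-crossing pairings.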
