Largest Eigenvalues of the Conjugate Kernel of Single-Layered Neural Networks
The paper offers theoretical insight into neural network behavior for researchers in machine learning and random matrix theory, but the contribution is incremental: it extends known results for linear ensembles to a nonlinear setting.
The paper studies the asymptotic distribution of the largest eigenvalues of a nonlinear random matrix ensemble that models the conjugate kernel of single-layered neural networks. It shows that the largest eigenvalue converges in probability to the same limit as in well-known linear ensembles, and it identifies a possible phase transition that depends on the activation function and on the distributions of the underlying random matrices.
This paper is concerned with the asymptotic distribution of the largest eigenvalues of a nonlinear random matrix ensemble arising in the study of neural networks. More precisely, we consider $M = \frac{1}{m} YY^\top$ with $Y = f(WX)$, where $W$ and $X$ are random rectangular matrices with i.i.d. centered entries. This models the data covariance matrix, or Conjugate Kernel, of a single-layered random feed-forward neural network. The function $f$ is applied entrywise and can be seen as the activation function of the neural network. We show that the largest eigenvalue has the same limit (in probability) as that of some well-known linear random matrix ensembles. In particular, we relate the asymptotic limit of the largest eigenvalue for the nonlinear model to that of an information-plus-noise random matrix, establishing a possible phase transition depending on the function $f$ and the distributions of $W$ and $X$. This may be of interest for applications to machine learning.
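The ensemble described above can be simulated numerically. The following is a minimal sketch, not the authors' code: it assumes standard Gaussian entries for $W$ and $X$, uses $\tanh$ as an example activation, and rescales the pre-activations by $1/\sqrt{n_0}$ so that they have unit variance. The abstract does not state a normalization, so that scaling is an assumption, as are the dimension names `n0`, `n1`, `m` and the helper names `conjugate_kernel` and `largest_eigenvalue`.

```python
import numpy as np


def conjugate_kernel(n0, n1, m, f, rng):
    """Sample M = (1/m) Y Y^T with Y = f(W X / sqrt(n0)).

    W is n1 x n0 and X is n0 x m, both with i.i.d. centered entries
    (standard Gaussian here, which is an assumption of this sketch).
    The 1/sqrt(n0) scaling is also an assumption, chosen so that the
    entries of the pre-activation matrix have unit variance.
    """
    W = rng.standard_normal((n1, n0))
    X = rng.standard_normal((n0, m))
    Y = f(W @ X / np.sqrt(n0))  # activation applied entrywise
    return Y @ Y.T / m  # n1 x n1 symmetric positive semidefinite matrix


def largest_eigenvalue(M):
    """Return the largest eigenvalue of a symmetric matrix."""
    # eigvalsh returns eigenvalues in ascending order.
    return float(np.linalg.eigvalsh(M)[-1])


rng = np.random.default_rng(0)
M = conjugate_kernel(n0=200, n1=150, m=300, f=np.tanh, rng=rng)
lam_max = largest_eigenvalue(M)
print(f"largest eigenvalue of M: {lam_max:.4f}")
```

Repeating the experiment with growing dimensions at fixed aspect ratios, or with a different activation `f`, gives a rough empirical view of how the top eigenvalue behaves. A single finite-size sample only illustrates the model and does not verify the limit stated in the paper.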