ML | STAT-MECH | LG | Apr 25, 2017

Spectral Ergodicity in Deep Learning Architectures via Surrogate Random Matrices

arXiv:1704.08303v3 | 7 citations
Originality: Incremental advance
AI Analysis

This work targets machine learning researchers who size and design deep and recurrent neural network architectures. It appears incremental, since it builds on existing random matrix theory concepts.

The authors quantify spectral ergodicity in deep learning architectures with a new method that combines the Thirumalai-Mountain metric and the Kullback-Leibler divergence. They apply it to random matrix ensembles that mimic neural network weight matrices and find that spectral ergodicity increases with matrix size, which suggests it may be key to network success.

In this work a novel method to quantify spectral ergodicity for random matrices is presented. The new methodology combines approaches rooted in the metrics of Thirumalai-Mountain (TM) and Kullback-Leibler (KL) divergence. The method is applied to a general study of deep and recurrent neural networks via the analysis of random matrix ensembles mimicking typical weight matrices of those systems. In particular, we examine circular random matrix ensembles: circular unitary ensemble (CUE), circular orthogonal ensemble (COE), and circular symplectic ensemble (CSE). Eigenvalue spectra and spectral ergodicity are computed for those ensembles as a function of network size. It is observed that as the matrix size increases the level of spectral ergodicity of the ensemble rises, i.e., the eigenvalue spectrum obtained for a single realisation drawn at random from the ensemble is closer to the spectrum obtained by averaging over the whole ensemble. Based on previous results we conjecture that the success of deep learning architectures is strongly bound to the concept of spectral ergodicity. The method to compute spectral ergodicity proposed in this work could be used to optimise the size and architecture of deep as well as recurrent neural networks.
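The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the TM-style deviation, the symmetrised KL distance, the bin count, and all function names below (`sample_cue`, `phase_density`, `ergodicity_deviation`, `symmetric_kl`) are assumptions made for illustration. Only the CUE is covered, sampled with the standard QR-based construction of Haar-distributed unitary matrices. In this sketch a smaller deviation means the single-realisation spectra sit closer to the ensemble average, i.e. higher spectral ergodicity.

```python
import numpy as np


def sample_cue(n, rng):
    """Draw an n x n Haar-distributed unitary matrix (CUE) via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    # Rescale each column by the phase of the matching diagonal entry of R,
    # which removes the QR phase ambiguity and gives the Haar measure.
    return q * (d / np.abs(d))


def phase_density(u, bins):
    """Histogram density of the eigenvalue phases of u on [-pi, pi]."""
    phases = np.angle(np.linalg.eigvals(u))
    density, _ = np.histogram(phases, bins=bins, range=(-np.pi, np.pi), density=True)
    return density


def ergodicity_deviation(n, n_samples, bins, rng):
    """Assumed TM-style measure: per-bin mean squared deviation of
    single-realisation spectral densities from the ensemble-averaged density."""
    densities = np.array(
        [phase_density(sample_cue(n, rng), bins) for _ in range(n_samples)]
    )
    ensemble_mean = densities.mean(axis=0)
    return ((densities - ensemble_mean) ** 2).mean(axis=0)


def symmetric_kl(a, b, eps=1e-12):
    """Symmetrised KL divergence between two non-negative vectors,
    each normalised to sum to one (assumed way of comparing two sizes)."""
    p = (a + eps) / np.sum(a + eps)
    q = (b + eps) / np.sum(b + eps)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


# Usage: compare the deviation for increasing matrix sizes (illustrative values).
rng = np.random.default_rng(0)
sizes = [8, 16, 32, 64]
omegas = [ergodicity_deviation(n, n_samples=40, bins=20, rng=rng) for n in sizes]
for n, omega in zip(sizes, omegas):
    print(n, omega.mean())
for (n1, o1), (n2, o2) in zip(zip(sizes, omegas), zip(sizes[1:], omegas[1:])):
    print(n1, n2, symmetric_kl(o1, o2))
```

Under these assumptions, the mean deviation should shrink as the matrix size grows, which matches the trend described in the abstract. COE and CSE would need their own samplers and are left out here.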

Code Implementations: 1 repo