ML · LG · Jan 31, 2023

On the Initialisation of Wide Low-Rank Feedforward Neural Networks

arXiv:2301.13710v1 · 4 citations · h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses initialization challenges for practitioners using low-rank networks to reduce computational and memory costs, but it is incremental as it extends existing full-rank results to the low-rank setting.

The authors address the initialization of wide low-rank feedforward neural networks by analyzing their edge-of-chaos dynamics. They derive the optimal weight and bias variances and show that the variance of the input-output Jacobian increases as the rank-to-width ratio decreases. This lets practitioners lower computational and memory costs while maintaining performance in networks with fewer learnable parameters.

The edge-of-chaos dynamics of wide randomly initialized low-rank feedforward networks are analyzed. Formulae for the optimal weight and bias variances are extended from the full-rank to the low-rank setting and are shown to follow from multiplicative scaling. The principal second-order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank-to-width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks with a reduced number of learnable parameters while staying in the same ambient dimension, allowing reductions in the computational cost and memory requirements of the associated network.
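
To make the idea of a multiplicative variance rescaling concrete, the following is a minimal sketch and not the paper's derivation. It assumes a layer weight factorised as W = A B, with A of size width × rank and B of size rank × width, and picks the factor variances so that each entry of W has variance sigma_w² / width, as in a standard full-rank Gaussian initialisation. The function names, the choice Var(A) = 1 / rank, and the factorised form itself are illustrative assumptions; the paper's actual optimal variances and its Jacobian-variance formula are not reproduced here.

```python
import random
import statistics


def init_low_rank(width, rank, sigma_w=1.0, seed=0):
    """Sample Gaussian factors A (width x rank) and B (rank x width).

    Assumed scaling (illustrative, not taken from the paper):
    Var(A_ik) = 1 / rank and Var(B_kj) = sigma_w**2 / width, so that
    Var((A B)_ij) = rank * Var(A_ik) * Var(B_kj) = sigma_w**2 / width.
    """
    rng = random.Random(seed)
    std_a = (1.0 / rank) ** 0.5
    std_b = sigma_w / width ** 0.5
    a = [[rng.gauss(0.0, std_a) for _ in range(rank)] for _ in range(width)]
    b = [[rng.gauss(0.0, std_b) for _ in range(width)] for _ in range(rank)]
    return a, b


def dense_product(a, b):
    """Materialise W = A B for inspection only; a real layer keeps the factors."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def entry_variance(w):
    """Empirical variance over all entries of a matrix given as nested lists."""
    return statistics.pvariance([x for row in w for x in row])


if __name__ == "__main__":
    width, rank, sigma_w = 128, 16, 1.5
    a, b = init_low_rank(width, rank, sigma_w)
    w = dense_product(a, b)
    print("target entry variance:   ", sigma_w ** 2 / width)
    print("empirical entry variance:", entry_variance(w))
    print("parameters (low-rank vs full):", 2 * width * rank, "vs", width ** 2)
```

The factors hold 2 · width · rank parameters instead of width², which is where the cost and memory savings come from. Matching the entry variance only controls the first-order behaviour; the abstract's point is that the variance of the input-output Jacobian still grows as rank / width shrinks, which this sketch does not measure.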

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
