On the Initialisation of Wide Low-Rank Feedforward Neural Networks
This work addresses initialization challenges for practitioners using low-rank networks to reduce computational and memory costs, but it is incremental as it extends existing full-rank results to the low-rank setting.
The authors tackle the problem of initializing wide low-rank feedforward neural networks by analyzing edge-of-chaos dynamics, deriving the optimal weight and bias variances, and showing that the variance of the input-output Jacobian increases as the rank-to-width ratio decreases. This lets practitioners reduce computational cost and memory use while maintaining performance in networks with fewer learnable parameters.
The edge-of-chaos dynamics of wide, randomly initialized low-rank feedforward networks are analyzed. Formulae for the optimal weight and bias variances are extended from the full-rank to the low-rank setting and are shown to follow from multiplicative scaling. The principal second-order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank-to-width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks with a reduced number of learnable parameters while remaining in the same ambient dimension, thereby reducing the computational cost and memory requirements of the associated network.
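As a rough illustration of what a multiplicative rescaling of the variances could look like, the sketch below initializes a rank-r layer as a product of two Gaussian factors. The factorization W = U V^T, the way the variance is split between U and V, and the names `init_low_rank_layer`, `forward`, `sigma_w`, and `sigma_b` are all assumptions made for this example rather than the construction used in the paper. The factor variances are chosen only so that each entry of the product W has variance sigma_w^2 / width, the usual full-rank scaling. The values of `sigma_w` and `sigma_b` are placeholders, since the edge-of-chaos values depend on the activation function and are not derived here.

```python
import numpy as np

def init_low_rank_layer(width, rank, sigma_w=1.0, sigma_b=0.0, seed=0):
    """Sample factors U, V (width x rank) and a bias b for a rank-`rank` layer.

    Assumption for this sketch: W = U @ V.T, with factor variances chosen so
    that Var(W_ij) = rank * Var(U_ik) * Var(V_jk) = sigma_w**2 / width.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, sigma_w / np.sqrt(rank), size=(width, rank))
    V = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, rank))
    b = rng.normal(0.0, sigma_b, size=width)
    return U, V, b

def forward(x, U, V, b):
    """Apply the layer without forming the full width x width matrix."""
    return U @ (V.T @ x) + b

# Illustrative values only; these are not edge-of-chaos parameters.
width, rank = 512, 64
U, V, b = init_low_rank_layer(width, rank, sigma_w=1.5, sigma_b=0.1)

W = U @ V.T                                 # formed here only for inspection
entry_var = W.var()                         # empirical variance of the entries
target_var = 1.5**2 / width                 # full-rank-style target variance
effective_rank = np.linalg.matrix_rank(W)   # numerical rank of the product
```

Storing the two factors takes 2 * width * rank numbers instead of width^2, and `forward` costs two thin matrix-vector products, which is where the savings in parameters and computation come from when the rank is well below the width.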