LG · CO · ML · May 16, 2020

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

arXiv:2005.08027v3 · 1 citation
AI Analysis

This paper addresses network initialization, the critical first step in training neural networks, and offers practitioners improved efficiency and accuracy, though the contribution is incremental because it builds on known initialization challenges.

The authors tackled the problem of initializing multi-layer feedforward neural networks by proposing a novel scheme based on Stein's identity, which uses eigenvectors of a cross-moment matrix to set weights, resulting in faster and more accurate training compared to existing methods.

Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks as cascades of multi-index models, the projection weights to the first hidden layer are initialized using eigenvectors of the cross-moment matrix between the input's second-order score function and the response. The input data is then forward propagated to the next layer and such a procedure can be repeated until all the hidden layers are initialized. Finally, the weights for the output layer are initialized by generalized linear modeling. Such a proposed SteinGLM method is shown through extensive numerical results to be much faster and more accurate than other popular methods commonly used for training neural networks.
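The layer-by-layer procedure described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inputs to each layer are treated as Gaussian, so the second-order score function takes the closed form S2(x) = P(x - mu)(x - mu)^T P - P with P the inverse covariance; the output layer uses ordinary least squares (a GLM with identity link); and the function names, the `ridge` term, the bias choice, and the tanh activation are all hypothetical choices made for this example. Each hidden layer in this sketch can be no wider than its input.

```python
import numpy as np


def second_order_stein_directions(X, y, k, ridge=1e-6):
    """Top-k eigenvectors of the empirical cross-moment matrix E[y * S2(x)].

    Assumption (not from the abstract): x is treated as Gaussian, so
    S2(x) = P (x - mu)(x - mu)^T P - P, where P is the inverse covariance.
    """
    n, d = X.shape
    if k > d:
        raise ValueError("k cannot exceed the input dimension in this sketch")
    mu = X.mean(axis=0)
    Xc = X - mu
    # Small ridge term (illustrative) keeps the covariance invertible.
    cov = Xc.T @ Xc / n + ridge * np.eye(d)
    P = np.linalg.inv(cov)
    Z = Xc @ P  # row i is P (x_i - mu)
    # (1/n) * sum_i y_i * (z_i z_i^T - P)
    M = (Z * y[:, None]).T @ Z / n - y.mean() * P
    M = (M + M.T) / 2  # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(M)
    # Keep the eigenvectors with the largest absolute eigenvalues.
    order = np.argsort(-np.abs(eigvals))[:k]
    return eigvecs[:, order]  # shape (d, k)


def stein_glm_init(X, y, hidden_sizes, activation=np.tanh):
    """Initialize hidden layers one at a time, then fit the output layer."""
    params = []
    H = X
    for k in hidden_sizes:
        W = second_order_stein_directions(H, y, k)
        # Illustrative bias: center the pre-activations at zero.
        b = -(H.mean(axis=0) @ W)
        params.append((W, b))
        # Forward propagate the data to initialize the next layer.
        H = activation(H @ W + b)
    # Output layer: least squares on the last hidden representation.
    A = np.column_stack([H, np.ones(len(H))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    params.append((coef[:-1], coef[-1]))
    return params
```

For example, calling `stein_glm_init(X, y, hidden_sizes=[3, 2])` on a five-feature regression dataset returns two `(W, b)` pairs for the hidden layers followed by the output weights and intercept, which could then be copied into a network before gradient-based training.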

