A Sober Look at Neural Network Initializations
This work examines the initialization phase of neural network training, which matters for optimization but has received little attention, and proposes an alternative initialization strategy for vanilla ReLU networks.
The paper addresses the understudied problem of neural network initialization: it analyzes the consequences of common strategies for vanilla ReLU-activated DNNs, develops an alternative strategy based on that analysis, and assesses the new strategy in large-scale experiments.
Initializing the weights and the biases is a key part of the training process of a neural network. Unlike the subsequent optimization phase, however, the initialization phase has received only limited attention in the literature. In this paper we discuss some consequences of commonly used initialization strategies for vanilla DNNs with ReLU activations. Based on these insights, we then develop an alternative initialization strategy. Finally, we present some large-scale experiments assessing the quality of the new initialization strategy.
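For orientation, the following is a minimal sketch of one commonly used initialization for a fully connected ReLU network: He-style normal weights with variance 2/fan_in and zero biases. The abstract does not say which strategies the paper analyzes, so this particular choice is an assumption, and the sketch is not the alternative strategy the paper develops. The function names `he_normal_init` and `relu_forward` are hypothetical and used only for illustration.

```python
import math
import random


def he_normal_init(layer_sizes, seed=0):
    """Return a list of (weights, biases) pairs, one per layer.

    Weights are drawn from N(0, 2 / fan_in) and biases are set to zero.
    This is an assumed example of a common strategy, not the paper's method.
    """
    rng = random.Random(seed)
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        std = math.sqrt(2.0 / fan_in)
        # One row of fan_in weights per output neuron.
        weights = [[rng.gauss(0.0, std) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        params.append((weights, biases))
    return params


def relu_forward(params, x):
    """Forward pass: ReLU after every layer except the last."""
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < last:
            x = [max(0.0, v) for v in x]
    return x
```

Calling `he_normal_init([4, 8, 3])` yields parameters for a network with 4 inputs, one hidden layer of 8 ReLU units, and 3 outputs. Passing those parameters and a length-4 input to `relu_forward` gives the network's output at initialization.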