Dense neural networks as sparse graphs and the lightning initialization
This work addresses a specific initialization problem in neural networks, offering an incremental improvement for training efficiency.
The paper tackles the suboptimal information flow in dense neural networks by proposing a lightning initialization that ensures complete information paths from input to output, resulting in faster accuracy increases for both pure dense and more complex networks.
Although dense networks have lost importance today, they are still used as final logic elements. It is shown that these dense networks can be simplified by interpreting them as sparse graphs. This interpretation in turn shows that the information flow between input and output is not optimal under the initializations in common use today. The lightning initialization sets the weights so that complete information paths exist between input and output from the start. Both pure dense networks and more complex networks with additional layers benefit from this initialization: their accuracy increases faster. The lightning initialization has two parameters, which behaved robustly in the tests carried out. However, especially with more complex networks, the improvement only occurs at lower learning rates, which shows that the initialization retains its positive effect over the epochs as the learning rate is reduced.
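The idea of "complete information paths" can be illustrated with a small sketch. The abstract does not specify the actual algorithm or name its two parameters, so everything below is an assumption made for illustration: the function names `path_init` and `has_complete_path`, the parameters `n_paths`, `path_weight`, and `background_std`, and the choice to define a path as a chain of edges whose magnitude exceeds a threshold. It is not the authors' lightning initialization, only one plausible way to seed strong input-to-output chains in a dense network and to check for them in a thresholded (sparse graph) view.

```python
import numpy as np


def path_init(layer_sizes, n_paths, path_weight=1.0, background_std=0.01, seed=0):
    """Hypothetical path-based initialization (illustrative, not the paper's method).

    Returns a list of weight matrices; matrix k has shape
    (layer_sizes[k], layer_sizes[k + 1]).
    """
    rng = np.random.default_rng(seed)
    # Start from small random weights everywhere (assumed background).
    weights = [
        rng.normal(0.0, background_std, size=(n_in, n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]
    for _ in range(n_paths):
        # Pick one neuron per layer and connect them with strong weights,
        # giving one unbroken chain from an input to an output.
        nodes = [rng.integers(n) for n in layer_sizes]
        for k, w in enumerate(weights):
            w[nodes[k], nodes[k + 1]] = path_weight
    return weights


def has_complete_path(weights, threshold):
    """Check whether any input reaches any output through edges with |w| >= threshold."""
    reachable = np.ones(weights[0].shape[0], dtype=bool)  # all inputs are start nodes
    for w in weights:
        strong = np.abs(w) >= threshold  # sparse-graph view of this layer
        reachable = (reachable[:, None] & strong).any(axis=0)
        if not reachable.any():
            return False
    return True


sizes = [8, 16, 16, 4]
seeded = path_init(sizes, n_paths=5)
plain = path_init(sizes, n_paths=0)  # small random weights only
print(has_complete_path(seeded, threshold=0.5))
print(has_complete_path(plain, threshold=0.5))
```

In this toy setting, the seeded network contains at least one chain of strong weights spanning all layers, while the purely random network has no edge above the threshold. Whether such chains speed up training is the paper's empirical claim and is not demonstrated by the sketch.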