Initialization of a Polyharmonic Cascade, Launch and Testing
This paper addresses the problem of training very deep networks, which is relevant to both researchers and practitioners, though the contribution appears incremental because it builds on prior work on polyharmonic cascades.
The paper tackles the challenge of training deep neural networks by proposing a universal initialization procedure for polyharmonic cascades. The procedure enables stable training of cascades with up to 500 layers without skip connections and achieves competitive results: 98.3% accuracy on MNIST and AUCs of approximately 0.885 on HIGGS and 0.963 on Epsilon.
This paper concludes a series of studies on the polyharmonic cascade, a deep machine learning architecture theoretically derived from indifference principles and the theory of random functions. A universal initialization procedure is proposed, based on symmetric constellations in the form of hyperoctahedra with a central point. This initialization not only ensures stable training of cascades with tens and hundreds of layers (up to 500 layers without skip connections), but also radically simplifies the computations. Scalability and robustness are demonstrated on MNIST (98.3% without convolutions or augmentations), HIGGS (AUC approximately 0.885 on 11M examples), and Epsilon (AUC approximately 0.963 with 2000 features). All linear algebra is reduced to 2D operations and is efficiently executed on GPUs. A public repository and an archived snapshot are provided for full reproducibility.
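The constellation geometry named in the abstract can be illustrated with a minimal sketch. Assuming the standard definition of a hyperoctahedron (cross-polytope) in n dimensions, its 2n vertices lie at plus and minus r along each coordinate axis, and adding the central point gives 2n + 1 points. The function name, the `radius` parameter, and the ordering of the points are illustrative assumptions, not taken from the paper or its repository; how the cascade actually consumes these points is not specified here.

```python
import numpy as np


def hyperoctahedron_with_center(dim, radius=1.0):
    """Return a (2*dim + 1, dim) array of points.

    Row 0 is the central point (the origin); the next `dim` rows are
    +radius along each axis; the last `dim` rows are -radius along each axis.
    The name, signature, and row order are hypothetical choices for this sketch.
    """
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    axes = radius * np.eye(dim)        # +r * e_i for each coordinate axis
    center = np.zeros((1, dim))        # the central point
    return np.vstack([center, axes, -axes])


points = hyperoctahedron_with_center(3)

# Pairwise Euclidean distances between all constellation points,
# computed with a single broadcasted subtraction.
dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
distinct = np.unique(np.round(dists, 12))

print(points.shape)   # (7, 3)
print(distinct)       # 0, r, r*sqrt(2), 2r for dim >= 2
```

For dim >= 2 the symmetric layout produces only four distinct pairwise distances (0, r, r√2, and 2r), regardless of the dimension. This is a geometric property of the constellation itself; whether it is the source of the computational simplification reported in the abstract is not stated in this section.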