Entropic alternatives to initialization

arXiv:2107.07757v2
1 citation
AI Analysis

This work addresses initialization challenges in deep learning, offering a novel regularization approach that could improve training stability and performance, though it appears incremental as it builds on existing entropic frameworks.

The paper addresses initialization in deep convolutional neural networks. It proposes a scoping protocol in which regularization by local entropic loss functions is strong early in training and then progressively fades. This gives a tunable model complexity and an alternative to standard initialization procedures.

Local entropic loss functions provide a versatile framework to define architecture-aware regularization procedures. Besides the possibility of being anisotropic in the synaptic space, the local entropic smoothening of the loss function can vary during training, thus yielding a tunable model complexity. A scoping protocol where the regularization is strong in the early stage of training and then progressively fades away constitutes an alternative to standard initialization procedures for deep convolutional neural networks; nonetheless, it has wider applicability. We analyze anisotropic, local entropic smoothenings in the language of statistical physics and information theory, providing insight into both their interpretation and workings. We comment on some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
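
To make the idea of an anisotropic, time-varying entropic smoothening concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation. It assumes one common form of the local entropic loss, -(1/beta) log E[exp(-beta L(w + eps))] with Gaussian eps, estimated by Monte Carlo sampling; the abstract does not state the paper's exact definition. The function names, the `sigma` and `beta` parameters, and the linear fading schedule are all hypothetical choices made for illustration.

```python
import numpy as np


def local_entropic_loss(loss, w, sigma, beta=1.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of -(1/beta) * log E[exp(-beta * loss(w + eps))].

    eps is drawn from N(0, diag(sigma**2)). Passing an array for sigma gives a
    different smoothing width per weight, i.e. an anisotropic smoothening.
    This functional form is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), w.shape)

    # Sample perturbed weights around w.
    eps = rng.standard_normal((n_samples,) + w.shape) * sigma
    values = np.array([loss(w + e) for e in eps])

    # Numerically stable log-mean-exp of -beta * loss.
    a = -beta * values
    m = a.max()
    log_mean = m + np.log(np.mean(np.exp(a - m)))
    return -log_mean / beta


def scoping_sigma(step, n_steps, sigma0):
    """Hypothetical linear scoping schedule: wide smoothing at step 0,
    no smoothing from step n_steps onward."""
    return sigma0 * max(0.0, 1.0 - step / n_steps)


def quadratic(w):
    """Toy loss used only for illustration."""
    return 0.5 * float(np.sum(w ** 2))


if __name__ == "__main__":
    w = np.zeros(1)
    for step in (0, 5, 10):
        s = scoping_sigma(step, n_steps=10, sigma0=1.0)
        print(step, s, local_entropic_loss(quadratic, w, sigma=s))
```

With `sigma` set to zero the smoothed loss reduces to the original loss, which is the sense in which the regularization fades away at the end of the schedule. For the toy quadratic loss in one dimension with `beta = 1`, the smoothed loss has the closed form w^2 / (2 (1 + sigma^2)) + (1/2) log(1 + sigma^2), which gives a quick check on the Monte Carlo estimate.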
