LG, ML | Jun 17, 2020

Constraint-Based Regularization of Neural Networks

arXiv:2006.10114v2 | 10 citations
Originality: Incremental advance
AI Analysis

This work addresses training stability and generalization for deep neural networks. It appears incremental, since it builds on existing Langevin dynamics by adding constraint-based modifications.

The authors incorporate constraints into a stochastic gradient Langevin framework for training deep neural networks, with the aim of controlling the parameters directly and improving robustness and generalization. They explore the method on test examples in image classification and natural language processing, arguing that appropriately designed constraints reduce the vanishing/exploding gradient problem and stabilize training.

We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately designed, they reduce the vanishing/exploding gradient problem, control weight magnitudes and stabilize deep neural networks and thus improve the robustness of training algorithms and the generalization capabilities of the trained neural network. We present examples of constrained training methods motivated by orthogonality preservation for weight matrices and explicit weight normalizations. We describe the methods in the overdamped formulation of Langevin dynamics and the underdamped form, in which momenta help to improve sampling efficiency. The methods are explored in test examples in image classification and natural language processing.
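
The two constraint families named in the abstract (explicit weight normalization and orthogonality of weight matrices) can be illustrated with a minimal NumPy sketch of an overdamped Langevin step followed by a projection back onto the constraint set. This is an assumption-laden illustration, not the authors' algorithm: the function names, the toy quadratic loss, the step size and temperature values, and the choice of projection (radial rescaling and the SVD polar factor) are all hypothetical, and the paper's own constraint-handling scheme and its underdamped variant may differ.

```python
import numpy as np

def project_to_sphere(w, radius=1.0):
    """Rescale w so that ||w|| == radius (explicit weight normalization)."""
    return w * (radius / np.linalg.norm(w))

def project_to_orthogonal(W):
    """Nearest matrix with orthonormal columns (polar factor via SVD)."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

def constrained_langevin_step(w, grad, step_size, temperature, project, rng):
    """One overdamped Langevin step, then projection onto the constraint set.

    Assumed simplification: an unconstrained noisy gradient step followed by
    a projection, rather than any specific scheme from the paper.
    """
    noise = rng.standard_normal(w.shape)
    w_free = w - step_size * grad + np.sqrt(2.0 * step_size * temperature) * noise
    return project(w_free)

rng = np.random.default_rng(0)

# Toy loss L(w) = 0.5 * ||w - target||^2, whose gradient is w - target.
target = np.array([2.0, 0.0, 0.0])
w = project_to_sphere(rng.standard_normal(3))
for _ in range(200):
    w = constrained_langevin_step(w, w - target, 0.05, 1e-3, project_to_sphere, rng)

# Orthogonality constraint on a 4x3 weight matrix (zero gradient, noise only).
W = project_to_orthogonal(rng.standard_normal((4, 3)))
W = constrained_langevin_step(W, np.zeros_like(W), 0.05, 1e-3, project_to_orthogonal, rng)
```

After the loop, `w` stays on the unit sphere while drifting toward the direction of `target`, and `W` keeps orthonormal columns after the noisy step, which is the sense in which constraints give direct control over weight magnitudes.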
