LG ML Jun 17, 2020

Constraint-Based Regularization of Neural Networks

Benedict Leimkuhler, Timothée Pouchon, Tiffany Vlaar, Amos Storkey

arXiv:2006.10114v25.010 citations

Originality: Incremental advance

AI Analysis

This work addresses training stability and generalization for deep neural networks, but it appears incremental as it builds on existing Langevin dynamics with constraint-based modifications.

The authors address the training of deep neural networks by incorporating constraints into a stochastic gradient Langevin framework, which gives direct control over the model parameters and is intended to improve robustness and generalization. They explore the method on image classification and natural language processing tasks, where appropriately designed constraints are meant to reduce vanishing/exploding gradients and stabilize training.

We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately designed, they reduce the vanishing/exploding gradient problem, control weight magnitudes and stabilize deep neural networks and thus improve the robustness of training algorithms and the generalization capabilities of the trained neural network. We present examples of constrained training methods motivated by orthogonality preservation for weight matrices and explicit weight normalizations. We describe the methods in the overdamped formulation of Langevin dynamics and the underdamped form, in which momenta help to improve sampling efficiency. The methods are explored in test examples in image classification and natural language processing.
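To make the idea of constrained Langevin training concrete, here is a minimal sketch in Python with NumPy. It is not the authors' discretization: it assumes a simple "step, then project" scheme, in which an overdamped Langevin update (gradient step plus Gaussian noise) is followed by a projection back onto the constraint set. The function names, the SVD-based projection onto orthogonal matrices, the toy quadratic loss, and the step size and temperature values are all illustrative assumptions.

```python
import numpy as np


def project_to_sphere(w, radius=1.0):
    """Rescale w so that its Euclidean norm equals radius (weight-norm constraint)."""
    return w * (radius / np.linalg.norm(w))


def project_to_orthogonal(W):
    """Return the nearest orthogonal matrix to W in Frobenius norm (polar factor via SVD)."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt


def constrained_langevin_step(params, grad, step_size, temperature, project, rng):
    """One overdamped Langevin step followed by projection onto the constraint set.

    This is an assumed projection-based scheme, used here only for illustration.
    """
    noise = rng.standard_normal(params.shape)
    proposal = (
        params
        - step_size * grad
        + np.sqrt(2.0 * step_size * temperature) * noise
    )
    return project(proposal)


rng = np.random.default_rng(0)

# Toy problem: minimize 0.5 * ||W - target||_F^2 while keeping W orthogonal.
target = np.eye(4)
W = project_to_orthogonal(rng.standard_normal((4, 4)))
for _ in range(100):
    grad = W - target  # gradient of the toy quadratic loss
    W = constrained_langevin_step(
        W, grad, step_size=1e-2, temperature=1e-4,
        project=project_to_orthogonal, rng=rng,
    )

# Deviation from orthogonality after training; should be at round-off level.
orth_error = np.linalg.norm(W.T @ W - np.eye(4))

# The same step function works with a norm constraint on a weight vector.
w = project_to_sphere(rng.standard_normal(10), radius=2.0)
w = constrained_langevin_step(
    w, grad=w, step_size=1e-2, temperature=1e-4,
    project=lambda v: project_to_sphere(v, radius=2.0), rng=rng,
)
sphere_norm = np.linalg.norm(w)
```

Because the projection is applied after every step, the iterate satisfies the constraint exactly (up to floating-point error) regardless of the noise level. An underdamped variant would additionally carry a momentum variable, which this sketch omits.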
