ML · LG · Apr 12, 2018

Regularisation of Neural Networks by Enforcing Lipschitz Continuity

arXiv:1804.04368v3 · 607 citations
Originality: Incremental advance
AI Analysis

The paper targets overfitting in neural networks and offers practitioners an incremental improvement over existing regularization techniques.

The authors tackle neural network overfitting by explicitly enforcing Lipschitz continuity as a regularizer. The resulting models outperform those trained with other common regularizers, especially when training data is limited.

We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs. To this end, we provide a simple technique for computing an upper bound to the Lipschitz constant---for multiple $p$-norms---of a feed forward neural network composed of commonly used layer types. Our technique is then used to formulate training a neural network with a bounded Lipschitz constant as a constrained optimisation problem that can be solved using projected stochastic gradient methods. Our evaluation study shows that the performance of the resulting models exceeds that of models trained with other common regularisers. We also provide evidence that the hyperparameters are intuitive to tune, demonstrate how the choice of norm for computing the Lipschitz constant impacts the resulting model, and show that the performance gains provided by our method are particularly noticeable when only a small amount of training data is available.
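The two ideas in the abstract, an upper bound on the Lipschitz constant and a projection step applied during training, can be illustrated with a minimal NumPy sketch. It rests on assumptions not stated on this page: the network is a plain stack of fully connected layers with 1-Lipschitz activations (such as ReLU), so the product of the layers' operator norms bounds the network's Lipschitz constant, and the projection simply rescales any weight matrix whose norm exceeds a per-layer limit. The names `operator_norm`, `lipschitz_upper_bound`, `project` and the parameter `lam` are hypothetical and chosen for illustration only; they are not taken from the authors' code.

```python
import numpy as np


def operator_norm(W, p):
    """Operator norm of W induced by the vector p-norm, for p in {1, 2, inf}.

    p=1   -> maximum absolute column sum
    p=2   -> largest singular value (spectral norm)
    p=inf -> maximum absolute row sum
    """
    if p not in (1, 2, np.inf):
        raise ValueError("p must be 1, 2 or np.inf")
    return float(np.linalg.norm(W, ord=p))


def lipschitz_upper_bound(weights, p=2):
    """Upper bound on the Lipschitz constant of a fully connected network.

    Assumes every activation is 1-Lipschitz, so the bound is the product
    of the per-layer operator norms.
    """
    bound = 1.0
    for W in weights:
        bound *= operator_norm(W, p)
    return bound


def project(W, lam, p=2):
    """Rescale W so that its operator norm is at most lam (assumed form).

    Matrices that already satisfy the constraint are returned unchanged.
    """
    return W / max(1.0, operator_norm(W, p) / lam)


# Usage: after each (stochastic) gradient step, project every layer.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(3, 8))]
weights = [project(W, lam=1.5, p=2) for W in weights]
bound = lipschitz_upper_bound(weights, p=2)  # at most 1.5 ** 2
```

With a per-layer limit `lam` and `d` layers, the bound after projection is at most `lam ** d`, which is one way to read the claim that the hyperparameter is intuitive to tune.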

Code Implementations: 1 repo
