LG · Aug 11, 2022

Regularizing Deep Neural Networks with Stochastic Estimators of Hessian Trace

arXiv:2208.05924v2 · 4 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses overfitting in deep learning, a practical concern for practitioners. It is an incremental advance that builds on prior regularization techniques.

The paper aims to improve generalization in deep neural networks by introducing a regularizer that penalizes the trace of the Hessian, a choice motivated by generalization error bounds. In the reported experiments it outperforms existing methods such as Jacobian regularization and Mixup.

In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent guarantee bound of the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method as a classical unbiased estimator for the trace of a matrix and further accelerate its calculation using a dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, such as Jacobian, Confidence Penalty, Label Smoothing, Cutout, and Mixup.
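
The Hutchinson method mentioned in the abstract estimates tr(A) as the average of v^T A v over random probe vectors v with zero mean and identity covariance, so only matrix-vector products are needed. The following is a minimal sketch of that generic estimator, not the paper's implementation: the function name `hutchinson_trace`, its parameters, and the use of an explicit NumPy matrix are assumptions made for illustration. In the paper's setting the matrix would be the Hessian of the loss, accessed through Hessian-vector products, and the dropout-based acceleration is not shown here.

```python
import numpy as np


def hutchinson_trace(matvec, dim, num_samples=100, seed=0):
    """Estimate tr(A) given only a function computing A @ v.

    Hypothetical helper for illustration. Uses Rademacher probes
    (entries are +1 or -1 with equal probability), for which
    E[v^T A v] = tr(A).
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        # Draw one Rademacher probe vector.
        v = rng.choice([-1.0, 1.0], size=dim)
        # Accumulate the quadratic form v^T A v.
        total += v @ matvec(v)
    return total / num_samples


# Usage on a small explicit symmetric matrix (a stand-in for a Hessian).
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 3.0]])
estimate = hutchinson_trace(lambda v: A @ v, dim=3, num_samples=2000)
print("estimate:", estimate, "exact:", np.trace(A))
```

The estimator never forms the matrix itself, which is what makes it attractive for Hessians of large networks. Its variance depends on the off-diagonal entries, so more probes are needed when those are large relative to the diagonal.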

Code Implementations: 1 repo