ML · LG · Mar 31, 2023

Per-Example Gradient Regularization Improves Learning Signals from Noisy Data

arXiv:2303.17940v1 · 9 citations · h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the challenge of noisy data in deep learning, offering an incremental improvement over existing gradient regularization methods.

The paper tackles the problem of improving learning from noisy data by introducing per-example gradient regularization (PEGR), which reduces test error and enhances robustness against noise perturbations by suppressing noise memorization.

Gradient regularization, as described in \citet{barrett2021implicit}, is a highly effective technique for promoting flat minima during gradient descent. Empirical evidence suggests that it can significantly enhance the robustness of deep learning models against noisy perturbations while also reducing test error. In this paper, we study per-example gradient regularization (PEGR) and present a theoretical analysis that demonstrates its effectiveness in improving both test error and robustness against noise perturbations. Specifically, we adopt the signal-noise data model from \citet{cao2022benign} and show that PEGR can learn signals effectively while suppressing noise. In contrast, standard gradient descent struggles to distinguish the signal from the noise, leading to suboptimal generalization performance. Our analysis reveals that PEGR penalizes the variance of pattern learning, thereby suppressing the memorization of noise from the training data. These findings underscore the importance of variance control in deep learning training and offer useful insights for developing more effective training approaches.
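
The link between a per-example gradient penalty and variance control can be illustrated with a small numerical sketch. The code below is not the paper's setting or implementation: it assumes a toy linear model with squared loss, takes the penalty to be the mean squared norm of the per-example gradients (an assumed form of PEGR), and uses hypothetical data and helper names. It checks the standard identity that the mean squared norm of per-example gradients equals the squared norm of the mean gradient plus the variance of the per-example gradients around that mean, so the per-example penalty exceeds a full-batch gradient penalty by exactly the variance term.

```python
# Illustrative sketch only: toy linear model with squared loss,
# not the signal-noise CNN model analyzed in the paper.

def per_example_grads(w, xs, ys):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w, one per example."""
    grads = []
    for x, y in zip(xs, ys):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        grads.append([residual * xi for xi in x])
    return grads

def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(vi * vi for vi in v)

def mean_vector(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def penalties(w, xs, ys):
    """Return (per-example penalty, full-batch penalty, gradient variance)."""
    grads = per_example_grads(w, xs, ys)
    n = len(grads)
    mean_grad = mean_vector(grads)
    # Assumed PEGR-style penalty: average of squared per-example gradient norms.
    per_example = sum(sq_norm(g) for g in grads) / n
    # Full-batch gradient penalty: squared norm of the averaged gradient.
    full_batch = sq_norm(mean_grad)
    # Variance of the per-example gradients around their mean.
    variance = sum(
        sq_norm([gi - mi for gi, mi in zip(g, mean_grad)]) for g in grads
    ) / n
    return per_example, full_batch, variance

# Hypothetical toy data and weights, chosen only for illustration.
xs = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [-1.0, 1.0]]
ys = [1.0, 0.0, 1.0, -1.0]
w = [0.5, -0.25]

per_example, full_batch, variance = penalties(w, xs, ys)
print("per-example penalty:", per_example)
print("full-batch penalty: ", full_batch)
print("gradient variance:  ", variance)
```

Under these assumptions, minimizing the per-example penalty also shrinks the spread of the individual gradients, which is one way to read the abstract's statement that PEGR penalizes variance. A full-batch penalty can be small even when individual gradients are large and cancel each other out.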

