LG · OC · ML · Feb 6, 2023

U-Clip: On-Average Unbiased Stochastic Gradient Clipping

arXiv:2302.02971v1 · 2 citations · h-index: 45
Originality: Incremental advance
AI Analysis

U-Clip addresses the bias that gradient clipping introduces in machine learning optimization, offering an incremental improvement over existing clipping methods.

The paper tackles the bias introduced by gradient clipping in optimization algorithms. It proposes U-Clip, which keeps the clipped-off portion of each gradient in a buffer and adds it back in subsequent iterations. The authors show that the cumulative bias of the resulting updates is bounded, so the updates are unbiased on average. Experiments on CIFAR10 and ImageNet support the method's effectiveness.
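The link between a bounded buffer and on-average unbiasedness can be seen from a short telescoping argument. The notation here is an assumption made for illustration and may differ from the paper's: $g_t$ is the stochastic gradient at step $t$, $\Delta_t$ is the buffer with $\Delta_0 = 0$, and $u_t$ is the clipped update.

$$
u_t = \operatorname{clip}(g_t + \Delta_{t-1}), \qquad \Delta_t = g_t + \Delta_{t-1} - u_t
$$

Rearranging the buffer recursion gives $u_t - g_t = \Delta_{t-1} - \Delta_t$, and summing over steps telescopes:

$$
\sum_{i=1}^{t} (u_i - g_i) = \Delta_0 - \Delta_t = -\Delta_t
$$

So the cumulative bias is exactly the negative of the current buffer. If the buffer stays bounded by a constant, which is the result the paper claims under its own assumptions, then $\frac{1}{t}\sum_{i=1}^{t}(u_i - g_i) \to 0$ as $t$ grows.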

U-Clip is a simple amendment to gradient clipping that can be applied to any iterative gradient optimization algorithm. Like regular clipping, U-Clip uses gradients that are clipped to a prescribed size (e.g. with component-wise or norm-based clipping). Instead of discarding the clipped portion of the gradient, however, U-Clip maintains a buffer of these values that is added to the gradients on the next iteration (before clipping). We show that the cumulative bias of the U-Clip updates is bounded by a constant. This implies that the clipped updates are unbiased on average. Convergence follows via a lemma that guarantees convergence with updates $u_i$ as long as $\sum_{i=1}^t (u_i - g_i) = o(t)$, where $g_i$ are the gradients. Extensive experimental exploration is performed on CIFAR10, with further validation given on ImageNet.
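The buffer mechanism described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names (`clip`, `u_clip_step`, `run_u_clip`), the clipping threshold `gamma`, the choice of component-wise clipping, and the toy gradient values are all assumptions made for the example.

```python
def clip(values, gamma):
    """Component-wise clipping of each entry to [-gamma, gamma] (assumed variant)."""
    return [max(-gamma, min(gamma, v)) for v in values]


def u_clip_step(grad, buffer, gamma):
    """One U-Clip-style step: add the carry buffer, clip, keep the remainder."""
    # The buffer is added to the gradient before clipping.
    combined = [g + b for g, b in zip(grad, buffer)]
    update = clip(combined, gamma)
    # Whatever clipping removed is carried over to the next iteration.
    new_buffer = [c - u for c, u in zip(combined, update)]
    return update, new_buffer


def run_u_clip(grads, gamma):
    """Apply the step to a sequence of gradients, starting from a zero buffer."""
    buffer = [0.0] * len(grads[0])
    updates = []
    for grad in grads:
        update, buffer = u_clip_step(grad, buffer, gamma)
        updates.append(update)
    return updates, buffer


# Hypothetical toy gradients: the first step has a large first component.
grads = [[3.0, -0.5], [0.5, 0.25], [-0.25, 2.0], [0.0, 0.0]]
updates, buffer = run_u_clip(grads, gamma=1.0)

# Per-coordinate totals of the clipped updates and of the raw gradients.
total_updates = [sum(col) for col in zip(*updates)]
total_grads = [sum(col) for col in zip(*grads)]
```

In this toy run the excess from the first gradient is released gradually over later steps, and once the buffer has drained to zero the totals of the updates and of the raw gradients coincide. Plain clipping without a buffer would have dropped that excess permanently.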
