OC LG Mar 10

SHANG++: Robust Stochastic Acceleration under Multiplicative Noise

arXiv:2603.09355 33.51 citations h-index: 2

AI Analysis

This work addresses robustness issues in stochastic optimization that matter to machine learning practitioners, though the contribution is incremental because it builds on existing accelerated methods.

The paper tackles the sensitivity of Nesterov acceleration to gradient noise under multiplicative noise scaling. The resulting method, SHANG++, improves stability and achieves faster convergence with stronger noise robustness; in a noise experiment on ResNet-34, it attains accuracy within 1% of the noise-free setting.

Under the multiplicative noise scaling (MNS) condition, original Nesterov acceleration is provably sensitive to noise and may diverge when gradient noise overwhelms the signal. In this paper, we develop two accelerated stochastic gradient descent methods by discretizing the Hessian-driven Nesterov accelerated gradient flow. We first derive SHANG, a direct Gauss-Seidel-type discretization that already improves stability under MNS. We then introduce SHANG++, which adds a damping correction and achieves faster convergence with stronger noise robustness. We establish convergence guarantees for both convex and strongly convex objectives under MNS, together with explicit parameter choices. In our experiments, SHANG++ performs consistently well across convex problems and applications in deep learning. In a dedicated noise experiment on ResNet-34, a single hyperparameter configuration attains accuracy within 1% of the noise-free setting. Across all experiments, SHANG++ outperforms existing accelerated methods in robustness and efficiency, with minimal parameter sensitivity.
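The abstract does not spell out the MNS condition. A minimal sketch, assuming the form commonly used in the literature, E‖g(x) − ∇f(x)‖² ≤ σ²‖∇f(x)‖², shows what "noise scaling with the gradient" means in practice. The helper names `noisy_gradient` and `empirical_noise_ratio` and the Gaussian noise model are illustrative assumptions, not taken from the paper, and this is not an implementation of SHANG or SHANG++.

```python
import numpy as np


def noisy_gradient(grad, sigma, rng):
    """Hypothetical stochastic oracle: the true gradient plus zero-mean
    Gaussian noise whose expected squared norm is sigma^2 * ||grad||^2.
    (Assumed noise model, chosen only to illustrate the MNS scaling.)"""
    d = grad.shape[0]
    z = rng.standard_normal(d)  # E||z||^2 = d
    return grad + sigma * np.linalg.norm(grad) * z / np.sqrt(d)


def empirical_noise_ratio(grad, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of E||g - grad||^2 / ||grad||^2."""
    rng = np.random.default_rng(seed)
    sq_err = 0.0
    for _ in range(n_samples):
        g = noisy_gradient(grad, sigma, rng)
        sq_err += np.sum((g - grad) ** 2)
    return (sq_err / n_samples) / np.sum(grad ** 2)


if __name__ == "__main__":
    grad = np.arange(1.0, 11.0)  # an arbitrary nonzero "true" gradient
    for sigma in (0.5, 2.0):
        ratio = empirical_noise_ratio(grad, sigma)
        print(f"sigma={sigma}: estimated ratio {ratio:.3f}, sigma^2 = {sigma**2}")
```

Under this assumed model the estimated ratio should sit close to σ². When σ > 1 the expected noise energy exceeds that of the gradient itself, which is the regime the abstract describes as noise overwhelming the signal.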
