Functional Central Limit Theorem for Stochastic Gradient Descent

arXiv:2602.15538v11.7h-index: 9

Originality: Incremental advance

AI Analysis

The paper provides a theoretical foundation for understanding SGD dynamics. The advance is incremental but useful for researchers in optimization and machine learning.

The authors characterize the long-term fluctuations of stochastic gradient descent (SGD) trajectories around the minimizer of a convex objective. They prove a functional central limit theorem that captures the temporal structure of these fluctuations and applies to non-smooth settings such as robust location estimation.

We study the asymptotic shape of the trajectory of the stochastic gradient descent algorithm applied to a convex objective function. Under mild regularity assumptions, we prove a functional central limit theorem for the properly rescaled trajectory. Our result characterizes the long-term fluctuations of the algorithm around the minimizer by providing a diffusion limit for the trajectory. In contrast with classical central limit theorems for the last iterate or Polyak-Ruppert averages, this functional result captures the temporal structure of the fluctuations and applies to non-smooth settings such as robust location estimation, including the geometric median.
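To make the geometric median setting concrete, here is a minimal sketch of SGD applied to the non-smooth objective f(x) = E||x - Z||, recording the whole trajectory rather than only the last iterate. It is an illustration under stated assumptions, not the paper's construction: the function name `sgd_geometric_median`, the step sizes c / n^alpha, and the Gaussian test data are hypothetical choices, and the rescaling used in the functional limit theorem is not reproduced here.

```python
import math
import random


def sgd_geometric_median(sample, x0, n_steps, c=1.0, alpha=0.75):
    """Run SGD on f(x) = E||x - Z|| and return every iterate.

    sample: zero-argument callable returning one draw of Z as a list of floats.
    The step sizes c / n**alpha are an illustrative assumption, not taken
    from the paper.
    """
    x = list(x0)
    trajectory = [list(x)]
    for n in range(1, n_steps + 1):
        z = sample()
        diff = [xi - zi for xi, zi in zip(x, z)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            # A subgradient of x -> ||x - z|| is the unit vector (x - z) / ||x - z||.
            step = c / n ** alpha
            x = [xi - step * d / norm for xi, d in zip(x, diff)]
        # If x == z, the zero vector is a valid subgradient, so x stays put.
        trajectory.append(list(x))
    return trajectory


# Hypothetical test data: a symmetric 2-D Gaussian, whose geometric median
# coincides with its center.
rng = random.Random(0)
center = [1.0, -2.0]


def draw():
    return [m + rng.gauss(0.0, 1.0) for m in center]


path = sgd_geometric_median(draw, x0=[0.0, 0.0], n_steps=20000)
final_error = math.dist(path[-1], center)
```

Keeping the full `path` is the point of the sketch: a functional limit theorem describes the fluctuations of the entire sequence of iterates around the minimizer, whereas a classical central limit theorem would only use `path[-1]` or the running average.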
