Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
This work offers incremental theoretical insight into stochastic gradient dynamics for researchers in machine learning theory, analyzing test risk in a controlled setting: a simple weak features model in the small learning rate regime.
The paper studies the test risk dynamics of stochastic gradient flow. It derives a general formula for the difference between the test risk curves of pure and stochastic gradient flows in the small learning rate regime, then applies it to a weak features model to compute explicit corrections. The analytical results agree well with simulations of discrete-time stochastic gradient descent.
We investigate the test risk of continuous-time stochastic gradient flow dynamics in learning theory. Using a path integral formulation, we provide, in the regime of a small learning rate, a general formula for computing the difference between the test risk curves of pure gradient and stochastic gradient flows. We apply the general theory to a simple model of weak features, which displays the double descent phenomenon, and explicitly compute the corrections brought about by the added stochastic term in the dynamics, as a function of time and model parameters. The analytical results are compared to simulations of discrete-time stochastic gradient descent and show good agreement.
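The kind of numerical comparison described above can be illustrated with a minimal sketch. It is not the paper's weak features model or its simulation code: it assumes a plain linear regression with isotropic Gaussian inputs and a linear teacher, and every name and parameter value (`simulate_risk_curves`, `n`, `d`, `lr`, `batch`, `noise`) is hypothetical. It runs full-batch gradient descent and mini-batch stochastic gradient descent from the same initial point and records the test risk of each over time.

```python
import numpy as np

def simulate_risk_curves(n=200, d=50, lr=0.01, steps=2000, batch=10,
                         noise=0.1, seed=0):
    """Return test risk curves for full-batch GD and mini-batch SGD.

    Assumed toy setup (not the paper's weak features model): linear
    regression with isotropic Gaussian inputs and a linear teacher.
    """
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d) / np.sqrt(d)   # hypothetical teacher weights
    X = rng.normal(size=(n, d))                # training inputs
    y = X @ w_star + noise * rng.normal(size=n)

    def test_risk(w):
        # Population risk 0.5 * E[(x.w - y)^2] for x ~ N(0, I), in closed form.
        return 0.5 * (np.sum((w - w_star) ** 2) + noise ** 2)

    w_gd = np.zeros(d)
    w_sgd = np.zeros(d)
    risk_gd = np.empty(steps)
    risk_sgd = np.empty(steps)
    for t in range(steps):
        risk_gd[t] = test_risk(w_gd)
        risk_sgd[t] = test_risk(w_sgd)
        # Full-batch gradient of the empirical squared loss.
        w_gd -= lr * X.T @ (X @ w_gd - y) / n
        # Mini-batch gradient on a random subset of the training set.
        idx = rng.choice(n, size=batch, replace=False)
        w_sgd -= lr * X[idx].T @ (X[idx] @ w_sgd - y[idx]) / batch
    return risk_gd, risk_sgd


risk_gd, risk_sgd = simulate_risk_curves()
gap = risk_sgd - risk_gd   # empirical analogue of the stochastic correction
print(f"final GD risk {risk_gd[-1]:.4f}, final SGD risk {risk_sgd[-1]:.4f}")
```

A single run of this sketch gives a noisy `gap` curve. Averaging over several values of `seed` for the mini-batch sampling, and repeating with smaller `lr`, is one way to see how the difference between the two risk curves depends on the learning rate.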