Stochastic optimization on matrices and a graphon McKean-Vlasov limit
This work provides a theoretical foundation for scaling optimization algorithms in graph-based machine learning, though it is incremental as it builds on prior graphon gradient flow results.
The paper establishes deterministic limits for stochastic gradient descent on large symmetric matrices, for functions that are invariant under simultaneous permutation of rows and columns. Under a small noise assumption, the limit is a gradient flow on graphons. With added, properly scaled reflected Brownian noise, the limit is instead a McKean-Vlasov-type curve of graphons characterized by stochastic differential equations with reflections.
We consider stochastic gradient descents on the space of large symmetric matrices of suitable functions that are invariant under permuting the rows and columns using the same permutation. We establish deterministic limits of these random curves as the dimensions of the matrices go to infinity while the entries remain bounded. Under a ``small noise'' assumption the limit is shown to be the gradient flow of functions on graphons whose existence was established in Oh, Somani, Pal, and Tripathi, \textit{J Theor Probab 37, 1469--1522 (2024)}. We also consider limits of stochastic gradient descents with added properly scaled reflected Brownian noise. The limiting curve of graphons is characterized by a family of stochastic differential equations with reflections and can be thought of as an extension of the classical McKean-Vlasov limit for interacting diffusions to the graphon setting. The proofs introduce a family of infinite-dimensional exchangeable arrays of reflected diffusions and a novel notion of propagation of chaos for large matrices of diffusions converging to such arrays in a suitable sense.
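To make the setting concrete, the following is a minimal numerical sketch of one noisy gradient step on a symmetric matrix with bounded entries. It is a toy illustration under stated assumptions, not the scheme analyzed in the paper: the objective (a triangle density, chosen only because it is invariant under simultaneous row and column permutations), the entry range $[0,1]$, the $n^2$ scaling of the Euclidean gradient, the `noise_scale` parameter, and all function names are hypothetical choices made for this example. Reflection at the boundary is approximated by folding each entry back into $[0,1]$ after an Euler-type step.

```python
import numpy as np


def triangle_density(A):
    """Permutation-invariant test function: trace(A^3) / n^3 (an assumed example)."""
    n = A.shape[0]
    return np.trace(A @ A @ A) / n**3


def scaled_gradient(A):
    """n^2 times the Euclidean gradient of triangle_density for symmetric A.

    The n^2 factor is an assumed normalization that keeps entries of order one.
    """
    n = A.shape[0]
    return 3.0 * (A @ A) / n


def reflect_unit_interval(X):
    """Fold every entry back into [0, 1] by reflecting at both endpoints."""
    Y = np.mod(X, 2.0)
    return np.where(Y > 1.0, 2.0 - Y, Y)


def symmetric_gaussian(n, rng):
    """Symmetric matrix of Gaussian increments (off-diagonal variance one)."""
    G = rng.standard_normal((n, n))
    return (G + G.T) / np.sqrt(2.0)


def noisy_sgd_step(A, step, noise_scale, rng):
    """One Euler-type step: gradient drift plus scaled noise, then reflection."""
    n = A.shape[0]
    drift = -step * scaled_gradient(A)
    noise = noise_scale * np.sqrt(step) * symmetric_gaussian(n, rng)
    return reflect_unit_interval(A + drift + noise)


def run(n=50, steps=200, step=0.01, noise_scale=0.1, seed=0):
    """Iterate the noisy step from a random symmetric start with entries in [0, 1]."""
    rng = np.random.default_rng(seed)
    U = rng.uniform(size=(n, n))
    A = (U + U.T) / 2.0
    for _ in range(steps):
        A = noisy_sgd_step(A, step, noise_scale, rng)
    return A


if __name__ == "__main__":
    A = run()
    print("triangle density after the run:", triangle_density(A))
```

Setting `noise_scale=0` gives a plain discretized gradient descent with folding at the boundary, which is the closest analogue in this toy to the small-noise regime; increasing `n` while keeping the entries bounded mimics the large-matrix limit described above.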