LG FA PR | Mar 13, 2025

Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data

Dibyakanti Kumar, Samyak Jha, Anirbit Mukherjee

arXiv:2503.10428v3 | 7.11 citations | h-index: 6

Originality: Highly original

AI Analysis

The paper provides a theoretical guarantee for learning neural networks at scale, a foundational challenge in machine learning, though it builds on recent observations about isoperimetry conditions.

The paper tackles the problem of learning depth-2 neural networks of any size, on any data, using the Langevin Monte-Carlo algorithm. It establishes non-asymptotic convergence rates by showing that the iterates converge to the Gibbs distribution of a regularized loss, where the amount of regularization required is independent of network size.

In this work, we establish that the Langevin Monte-Carlo (LMC) algorithm can learn depth-2 neural nets of any size and for any data, and we give non-asymptotic convergence rates for it. We achieve this by showing that, in q-Rényi divergence, the iterates of LMC converge to the Gibbs distribution of Frobenius-norm-regularized losses for any of these nets, when using smooth activations, in both classification and regression settings. Most critically, the amount of regularization needed for our results is independent of the size of the net. This result synthesizes several recent observations: isoperimetry conditions under which LMC converges, and the fact that two-layer neural loss functions can always be regularized by a certain constant amount such that they satisfy the Villani conditions, so that their Gibbs measures satisfy a Poincaré inequality.
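To make the setting concrete, here is a minimal sketch of Langevin Monte-Carlo on a Frobenius-norm-regularized squared loss of a depth-2 net. It is an illustration under stated assumptions, not the paper's exact setup: the tanh activation, the fixed outer weights `a`, the squared loss, the noise scaling `sqrt(2 * step / beta)`, and all names (`reg_loss`, `reg_loss_grad`, `lmc`, `lam`, `beta`) are choices made here for the example.

```python
import math
import random


def reg_loss(W, a, data, lam):
    """(1/2n) * sum (f(x) - y)^2 + (lam/2) * ||W||_F^2 for f(x) = sum_j a[j] * tanh(<W[j], x>)."""
    n = len(data)
    total = 0.0
    for x, y in data:
        pred = sum(a_j * math.tanh(sum(w * xi for w, xi in zip(w_j, x)))
                   for a_j, w_j in zip(a, W))
        total += 0.5 * (pred - y) ** 2 / n
    frob_sq = sum(w * w for w_j in W for w in w_j)
    return total + 0.5 * lam * frob_sq


def reg_loss_grad(W, a, data, lam):
    """Gradient of reg_loss with respect to the inner weights W (outer weights a are held fixed)."""
    n = len(data)
    grad = [[lam * w for w in w_j] for w_j in W]  # gradient of the Frobenius penalty
    for x, y in data:
        pre = [sum(w * xi for w, xi in zip(w_j, x)) for w_j in W]
        resid = sum(a_j * math.tanh(p) for a_j, p in zip(a, pre)) - y
        for j, p in enumerate(pre):
            # d/dp tanh(p) = 1 - tanh(p)^2
            coef = resid * a[j] * (1.0 - math.tanh(p) ** 2) / n
            for k, xk in enumerate(x):
                grad[j][k] += coef * xk
    return grad


def lmc(W, a, data, lam, step, beta, n_iters, rng):
    """Iterate W <- W - step * grad + sqrt(2 * step / beta) * N(0, I).

    For small step sizes this approximately targets the Gibbs density
    proportional to exp(-beta * reg_loss(W)).
    """
    noise_scale = math.sqrt(2.0 * step / beta)
    for _ in range(n_iters):
        g = reg_loss_grad(W, a, data, lam)
        W = [[w - step * gw + noise_scale * rng.gauss(0.0, 1.0)
              for w, gw in zip(w_j, g_j)]
             for w_j, g_j in zip(W, g)]
    return W


if __name__ == "__main__":
    rng = random.Random(0)
    data = [([1.0, 0.5], 0.3), ([-0.5, 1.0], -0.2)]  # toy (x, y) pairs
    a = [1.0, -1.0, 0.5]                             # fixed outer weights
    W0 = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in a]
    W = lmc(W0, a, data, lam=0.1, step=0.01, beta=10.0, n_iters=500, rng=rng)
    print(reg_loss(W0, a, data, 0.1), reg_loss(W, a, data, 0.1))
```

Note that the regularization strength `lam` in this sketch is set without reference to the width `len(a)`, which mirrors the abstract's claim that the required regularization does not depend on the size of the net. The values chosen for `lam`, `step`, and `beta` are arbitrary and carry no guarantee.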
