Grégoire Ferré

2 Papers

39.4 · ML · Apr 20
Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario

Florentin Coeurdoux, Grégoire Ferré, Jean-Philippe Bouchaud

Empirical studies of trained models often report a transient regime in which signal is detectable in a finite gradient descent time window before overfitting dominates. We provide an analytically tractable random-matrix model that reproduces this phenomenon for gradient flow in a linear teacher-student setting. In this framework, learning occurs when an isolated eigenvalue separates from a noisy bulk, before eventually disappearing in the overfitting regime. The key ingredient is anisotropy in the input covariance, which induces fast and slow directions in the learning dynamics. In a two-block covariance model, we derive the full time-dependent bulk spectrum of the symmetrized weight matrix through a $2\times 2$ Dyson equation, and we obtain an explicit outlier condition for a rank-one teacher via a rank-two determinant formula. This yields a transient Baik-Ben Arous-Péché (BBP) transition: depending on signal strength and covariance anisotropy, the teacher spike may never emerge, emerge and persist, or emerge only during an intermediate time interval before being reabsorbed into the bulk. We map the corresponding phase diagrams and validate the theory against finite-size simulations. Our results provide a minimal solvable mechanism for early stopping as a transient spectral effect driven by anisotropy and noise.
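
The statement that covariance anisotropy induces fast and slow learning directions can be illustrated with a minimal sketch. This is not the paper's random-matrix model: it assumes noiseless population gradient flow for a single linear student vector with a diagonal two-block covariance, so that each coordinate relaxes independently at a rate set by its covariance eigenvalue. The names `two_block_flow`, `euler_flow`, `lam_fast`, and `lam_slow`, and all numerical values, are hypothetical choices made for illustration.

```python
import numpy as np

def two_block_flow(w_star, lams, t):
    """Closed-form solution of dw/dt = -diag(lams) (w - w_star) with w(0) = 0."""
    return (1.0 - np.exp(-lams * t)) * w_star

def euler_flow(w_star, lams, t, n_steps=20000):
    """Explicit Euler integration of the same flow, used as a cross-check."""
    dt = t / n_steps
    w = np.zeros_like(w_star)
    for _ in range(n_steps):
        w = w - dt * lams * (w - w_star)
    return w

# Assumed two-block spectrum: half the directions are fast, half are slow.
d = 6
lam_fast, lam_slow = 4.0, 0.25
lams = np.array([lam_fast] * (d // 2) + [lam_slow] * (d // 2))
w_star = np.ones(d)  # hypothetical teacher vector

t = 1.0
w_t = two_block_flow(w_star, lams, t)
fast_fraction = w_t[0] / w_star[0]    # fraction of the teacher learned in a fast direction
slow_fraction = w_t[-1] / w_star[-1]  # fraction learned in a slow direction
```

At the same time `t`, the fast block is nearly converged while the slow block has barely moved. In the abstract's setting, noise is fitted along these same directions at different rates, which is the separation of time scales the transient regime relies on.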

NA · May 2, 2019
Error estimates on ergodic properties of discretized Feynman-Kac semigroups

Grégoire Ferré, Gabriel Stoltz

We consider the numerical analysis of the time discretization of Feynman-Kac semigroups associated with diffusion processes. These semigroups naturally appear in several fields, such as large deviation theory, Diffusion Monte Carlo or non-linear filtering. We present error estimates à la Talay-Tubaro on their invariant measures when the underlying continuous stochastic differential equation is discretized, as well as on the leading eigenvalue of the generator of the dynamics, which corresponds to the rate of creation of probability. This provides criteria to construct efficient integration schemes of Feynman-Kac dynamics, as well as a mathematical justification of numerical results already observed in the Diffusion Monte Carlo community. Our analysis is illustrated by numerical simulations.
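
As a rough illustration of what a Talay-Tubaro-type estimate on an invariant measure looks like, the sketch below treats the plain (unweighted) case only, not the Feynman-Kac setting of the paper. It assumes the Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW, whose invariant measure has variance 1, discretized by the Euler-Maruyama scheme X_{n+1} = (1 - dt) X_n + sqrt(2 dt) G_n. For this linear scheme the stationary variance is available in closed form, so the bias can be computed without sampling. The function name `euler_stationary_variance` and the step sizes are hypothetical.

```python
def euler_stationary_variance(dt):
    """Stationary variance v of X_{n+1} = (1 - dt) X_n + sqrt(2 dt) G_n.

    v solves the fixed-point relation v = (1 - dt)^2 v + 2 dt.
    """
    a = 1.0 - dt
    return 2.0 * dt / (1.0 - a * a)

# The continuous-time invariant measure has variance exactly 1.
steps = [0.1, 0.05, 0.025]
biases = [euler_stationary_variance(dt) - 1.0 for dt in steps]

# If the bias is C * dt + O(dt^2), then bias / dt approaches the constant C.
ratios = [b / dt for b, dt in zip(biases, steps)]
```

Here the ratios approach 1/2 as the time step shrinks, so the invariant measure of the discretized chain carries a bias of first order in the time step. The abstract describes estimates of this kind for Feynman-Kac dynamics and for the leading eigenvalue.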