ML LG

Sep 6, 2022

Generalisation under gradient descent via deterministic PAC-Bayes

Eugenio Clerico, Tyler Farghly, George Deligiannidis, Benjamin Guedj, Arnaud Doucet

Oxford

arXiv:2209.02525v412.47 citations

h-index: 89

Originality: Incremental advance

AI Analysis

This work provides a theoretical foundation for understanding generalisation in deterministic optimisation algorithms. The advance is incremental, but it addresses a known bottleneck in PAC-Bayesian theory: the usual reliance on randomised predictors.

The paper establishes generalisation bounds for deterministic gradient descent methods without requiring a de-randomisation step. The resulting bounds are fully computable and depend on the density of the initial distribution and the Hessian of the training objective along the optimisation trajectory.

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.
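
To illustrate why the initial density and the Hessian along the trajectory are the natural computable quantities, here is a minimal sketch based on the standard change-of-variables identity, not the paper's actual bound. If a gradient descent step θ ↦ θ − η∇L(θ) is invertible, the log-density of the iterate under the pushforward of the initial distribution changes by −log|det(I − η∇²L(θ))| at each step. The function names, the quadratic objective, the Gaussian initial distribution, and all numerical values below are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian N(mean, cov) at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)


def gd_with_log_density(theta0, log_rho0, grad, hess, eta, n_steps):
    """Run gradient descent and track the log-density of the iterate.

    Assumes every step map theta -> theta - eta * grad(theta) is invertible,
    so the change-of-variables formula applies.
    """
    theta = np.array(theta0, dtype=float)
    log_rho = log_rho0(theta)
    d = theta.shape[0]
    for _ in range(n_steps):
        # Jacobian of the step map, evaluated at the current iterate.
        jac = np.eye(d) - eta * hess(theta)
        _, logabsdet = np.linalg.slogdet(jac)
        log_rho -= logabsdet
        theta = theta - eta * grad(theta)
    return theta, log_rho


# Hypothetical quadratic objective L(theta) = 0.5 * theta^T A theta - b^T theta.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
eta, n_steps, sigma = 0.1, 5, 1.0

theta0 = np.array([0.3, -0.7])  # one draw from the initial distribution (fixed here)
theta_K, log_rho_K = gd_with_log_density(
    theta0,
    log_rho0=lambda t: gaussian_logpdf(t, np.zeros(2), sigma**2 * np.eye(2)),
    grad=lambda t: A @ t - b,
    hess=lambda t: A,
    eta=eta,
    n_steps=n_steps,
)
print("final iterate:", theta_K)
print("log-density of final iterate:", log_rho_K)
```

For this quadratic objective the step map is affine, so the pushforward distribution is itself Gaussian and the tracked value can be checked against a closed form. For a neural network the Hessian log-determinant would have to be computed or estimated along the trajectory, which this sketch does not attempt.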

