A large sample theory for infinitesimal gradient boosting
This work provides theoretical foundations for gradient boosting methods; the contribution is incremental but important for understanding the behavior of these algorithms in machine learning.
The paper studies the asymptotic behavior of infinitesimal gradient boosting in the large sample limit, proving convergence to a deterministic process and showing that the test error decreases along the dynamics of this population limit.
Infinitesimal gradient boosting (Dombry and Duchamps, 2021) is defined as the vanishing-learning-rate limit of the popular tree-based gradient boosting algorithm from machine learning. It is characterized as the solution of a nonlinear ordinary differential equation in an infinite-dimensional function space, where the infinitesimal boosting operator driving the dynamics depends on the training sample. We consider the asymptotic behavior of the model in the large sample limit and prove its convergence to a deterministic process. This population limit is again characterized by a differential equation that depends on the population distribution. We explore some properties of this population limit: we prove that the dynamics make the test error decrease and we consider its long-time behavior.
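To make the vanishing-learning-rate idea concrete, the following is a minimal sketch and not the construction used in the paper. It assumes squared loss, one-dimensional inputs, and greedy depth-one regression trees (stumps), whereas the paper works with a specific randomized tree model. The names `fit_stump`, `boost`, `lr` and `horizon` are hypothetical. The boosting "time" is taken to be the learning rate multiplied by the number of iterations, so a smaller `lr` means more, smaller steps over the same time horizon. The sketch only tracks the training error on a finite sample, which is a simpler quantity than the test error of the population limit discussed above.

```python
import numpy as np


def fit_stump(x, residual):
    """Least-squares regression stump (one split) on 1-D inputs.

    Assumes the values in x are distinct. Returns (threshold, left_value,
    right_value), where the leaf values are the mean residuals on each side.
    """
    order = np.argsort(x)
    xs, rs = x[order], residual[order]
    n = len(xs)
    csum = np.cumsum(rs)
    k = np.arange(1, n)                      # candidate left-leaf sizes
    left_mean = csum[:-1] / k
    right_mean = (csum[-1] - csum[:-1]) / (n - k)
    # Maximizing this gain is equivalent to minimizing the squared error.
    gain = k * left_mean**2 + (n - k) * right_mean**2
    j = int(np.argmax(gain))
    threshold = 0.5 * (xs[j] + xs[j + 1])
    return threshold, left_mean[j], right_mean[j]


def boost(x, y, lr, horizon):
    """Run gradient boosting for round(horizon / lr) steps of size lr.

    Returns the fitted values on the training points and the training
    mean squared error after each step (including step 0).
    """
    n_steps = int(round(horizon / lr))
    pred = np.zeros_like(y, dtype=float)
    errors = [float(np.mean((y - pred) ** 2))]
    for _ in range(n_steps):
        # For squared loss, the negative gradient is the residual y - pred.
        thr, left_val, right_val = fit_stump(x, y - pred)
        pred = pred + lr * np.where(x <= thr, left_val, right_val)
        errors.append(float(np.mean((y - pred) ** 2)))
    return pred, errors


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(size=200)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)
    # Same time horizon, two learning rates: the smaller one takes more steps.
    for lr in (0.1, 0.01):
        _, errors = boost(x, y, lr=lr, horizon=2.0)
        print(lr, len(errors) - 1, errors[-1])
```

In this simplified setting each step subtracts `lr` times a least-squares projection of the residual, so the training error cannot increase for any `lr` between 0 and 2. Whether the fitted paths for different learning rates agree at a fixed horizon depends on the tree model; the greedy split used here is not claimed to have the regularity that the paper's construction provides.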