ScoreStop: Gradient-based early stopping using functional score tests
For practitioners using gradient boosting, ScoreStop provides an interpretable, scale-invariant stopping rule that works with implicit or data-dependent losses, though it is incremental over existing loss-based approaches.
ScoreStop is a gradient-based early-stopping rule for gradient boosted decision trees that uses a functional score test to decide when to stop, avoiding the need for a patience parameter. It is competitive with loss-based methods on synthetic and real-data benchmarks.
Gradient boosted decision trees require a stopping rule to avoid overfitting. The standard rule monitors a validation loss and stops if the loss fails to improve for a fixed patience period. However, the patience parameter has no interpretable scale, and validation losses can be noisy or only implicitly defined by a user-specified gradient. We propose ScoreStop, a gradient-based early-stopping rule that casts the stopping decision at each iteration as a test of the null hypothesis that the current predictor is the population risk minimizer. We use a functional score test, computed on validation data, whose statistic is scale-invariant in the update direction and has a known asymptotic distribution under the null. Because our test uses gradients rather than loss values, the same construction applies to implicit losses such as LambdaRank and, via influence functions, to data-dependent losses such as Cox regression. In synthetic experiments and real-data benchmarks, we show that ScoreStop is competitive with loss-based methods.
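To make the idea of a gradient-based stopping test concrete, the following is a minimal sketch and not necessarily the statistic used in the paper. It assumes per-example validation gradients `grad` (the derivative of the loss with respect to the prediction) and the candidate update `direction` (the new tree's predictions on the validation set). The studentized mean of the products, the one-sided normal approximation, and the names `score_statistic`, `should_stop`, and `alpha` are all illustrative assumptions.

```python
import math
import numpy as np


def score_statistic(grad, direction):
    """Studentized mean of per-example directional derivatives g_i * h_i.

    Assumed form for illustration: rescaling `direction` by a positive
    constant leaves the statistic unchanged, because the mean and the
    standard deviation scale together.
    """
    s = np.asarray(grad, dtype=float) * np.asarray(direction, dtype=float)
    mean = s.mean()
    sd = s.std(ddof=1)
    if sd == 0.0:
        # Degenerate case: every product is identical.
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return math.sqrt(s.size) * mean / sd


def should_stop(grad, direction, alpha=0.05):
    """One-sided test of H0: E[g * h] = 0 against E[g * h] < 0.

    A clearly negative statistic means the update still reduces the
    validation risk, so boosting continues. Failing to reject H0 at
    level `alpha` is treated as the signal to stop.
    """
    t = score_statistic(grad, direction)
    # Standard normal CDF at t, used as an approximate null distribution.
    p_value = 0.5 * math.erfc(-t / math.sqrt(2.0))
    return p_value > alpha, t, p_value


if __name__ == "__main__":
    h = np.ones(5)
    # Gradients centered at zero: no evidence the update still helps.
    print(should_stop(np.array([-1.0, 1.0, -2.0, 2.0, 0.0]), h))
    # Gradients consistently negative along h: the update still helps.
    print(should_stop(np.array([-2.0, -2.5, -1.5, -2.2, -1.8]), h))
```

In this sketch `alpha` plays the role that patience plays in the loss-based rule, but it has a probabilistic interpretation as an approximate significance level. Only gradients enter the computation, so no explicit loss value is required.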