LG ML Aug 29, 2018

Proximal boosting: aggregating weak learners to minimize non-differentiable losses

arXiv:1808.09670v4, 3 citations
Originality: Incremental advance
AI Analysis

This paper addresses a limitation of boosting methods for machine learning practitioners who work with non-differentiable losses, though it appears incremental because it extends existing proximal and boosting techniques.

The paper tackles the problem of minimizing non-differentiable losses in boosting by proposing proximal boosting, which builds on the proximal point algorithm, and shows improved convergence rate and prediction accuracy over gradient boosting in experiments.

Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper proposes to build upon the proximal point algorithm, when the empirical risk to minimize is not differentiable, in order to introduce a novel boosting approach, called proximal boosting. It comes with a companion algorithm inspired by [1] and called residual proximal boosting, which is aimed at better controlling the approximation error. Theoretical convergence is proved for these two procedures under different hypotheses on the empirical risk and advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy.
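To make the idea in the abstract concrete, here is a minimal sketch of one plausible reading of proximal boosting for the absolute loss. It is not the authors' algorithm. In place of a subgradient, each round fits a weak learner to the direction (prox(f) - f) / lam, where prox is the proximal operator of the loss evaluated at the current predictions. The names `prox_abs`, `fit_stump`, `predict_stump`, `proximal_boost`, the parameters `lam` and `step`, the per-sample treatment of the risk, and the choice of regression stumps as weak learners are all assumptions made for illustration.

```python
import numpy as np

def prox_abs(f, y, lam):
    """Proximal operator of u -> lam * |y - u| at f (soft-thresholding around y)."""
    r = f - y
    return y + np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

def fit_stump(x, target):
    """Least-squares regression stump on a 1-D feature.

    Returns (threshold, left_value, right_value).
    """
    best = (np.inf, x[0], target.mean(), target.mean())
    for thr in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left = x <= thr
        lv, rv = target[left].mean(), target[~left].mean()
        err = np.sum((target - np.where(left, lv, rv)) ** 2)
        if err < best[0]:
            best = (err, thr, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    thr, lv, rv = stump
    return np.where(x <= thr, lv, rv)

def proximal_boost(x, y, n_rounds=50, lam=0.5, step=0.5):
    """Hypothetical proximal boosting loop for the mean absolute error."""
    f = np.full_like(y, np.median(y), dtype=float)  # constant initial model
    history = [np.mean(np.abs(y - f))]
    for _ in range(n_rounds):
        # Proximal direction: for the absolute loss this equals
        # clip((y - f) / lam, -1, 1), so it is defined even where |y - f| has a kink.
        direction = (prox_abs(f, y, lam) - f) / lam
        stump = fit_stump(x, direction)          # weak learner approximates the direction
        f = f + step * predict_stump(stump, x)   # shrunken additive update
        history.append(np.mean(np.abs(y - f)))
    return f, history

# Simulated data, chosen arbitrarily for this sketch.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)
f, history = proximal_boost(x, y)
print(f"training MAE: {history[0]:.3f} -> {history[-1]:.3f}")
```

For the absolute loss, the proximal direction coincides with the negative gradient of a Huber-type smoothing of the loss, so large residuals contribute a bounded direction and small residuals contribute a proportionally smaller one.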

