LGOCSep 22, 2022

Boosting as Frank-Wolfe

arXiv:2209.10831v22 citationsh-index: 20
Originality Incremental advance
AI Analysis

This work addresses a computational bottleneck in boosting algorithms for machine learning practitioners, offering an incremental improvement by unifying existing methods into a more efficient framework.

The paper tackles the issue of boosting algorithms for soft margin optimization, where LPBoost is fast but has worst-case inefficiency, while ERLPBoost and C-ERLPBoost have better guarantees but high per-iteration cost. The result is a generic boosting scheme combining Frank-Wolfe with any secondary algorithm, achieving the same convergence guarantee as ERLPBoost and performing comparably to LPBoost in experiments.

Some boosting algorithms, such as LPBoost, ERLPBoost, and C-ERLPBoost, aim to solve the soft margin optimization problem with the $\ell_1$-norm regularization. LPBoost rapidly converges to an $ε$-approximate solution in practice, but it is known to take $Ω(m)$ iterations in the worst case, where $m$ is the sample size. On the other hand, ERLPBoost and C-ERLPBoost are guaranteed to converge to an $ε$-approximate solution in $O(\frac{1}{ε^2} \ln \frac{m}ν)$ iterations. However, the computation per iteration is very high compared to LPBoost. To address this issue, we propose a generic boosting scheme that combines the Frank-Wolfe algorithm and any secondary algorithm and switches one to the other iteratively. We show that the scheme retains the same convergence guarantee as ERLPBoost and C-ERLPBoost. One can incorporate any secondary algorithm to improve in practice. This scheme comes from a unified view of boosting algorithms for soft margin optimization. More specifically, we show that LPBoost, ERLPBoost, and C-ERLPBoost are instances of the Frank-Wolfe algorithm. In experiments on real datasets, one of the instances of our scheme exploits the better updates of the secondary algorithm and performs comparably with LPBoost.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes