Parallel coordinate descent for the Adaboost problem
This work addresses the scalability of boosting algorithms. Its contribution is incremental in that it adapts existing parallel coordinate descent methods to Adaboost.
The authors scale Adaboost to large datasets with a randomised parallel version based on coordinate descent. They derive a theoretical parallelisation speedup factor and report performance that is competitive with other approaches, especially on large-scale learning problems.
We design a randomised parallel version of Adaboost based on previous studies on parallel coordinate descent. The algorithm uses the fact that the logarithm of the exponential loss is a function with coordinate-wise Lipschitz continuous gradient to define the step lengths. We provide a proof of convergence for this randomised Adaboost algorithm and a theoretical parallelisation speedup factor. Finally, we provide numerical examples on learning problems of various sizes that show that the algorithm is competitive with competing approaches, especially for large-scale problems.
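
To illustrate how coordinate-wise Lipschitz constants can set the step lengths, here is a minimal Python sketch. It is not the authors' implementation. It assumes a matrix M whose entry M[j, i] is the label of example j times the output of weak learner i, with all entries in [-1, 1]. The names log_exp_loss and parallel_cd, the bound L_i <= max_j M[j, i]^2, and the conservative damping factor beta = tau are assumptions made for this example; the paper's own step lengths and speedup factor may differ. The parallel updates are simulated sequentially.

```python
import numpy as np


def log_exp_loss(M, w):
    """F(w) = log((1/m) * sum_j exp(-(M w)_j)), computed in a stable way."""
    z = -M @ w
    zmax = z.max()
    return zmax + np.log(np.exp(z - zmax).mean())


def parallel_cd(M, tau, n_iters, seed=0):
    """Randomised parallel coordinate descent on the log of the exponential loss.

    Assumptions of this sketch (not taken from the paper):
      * L_i = max_j M[j, i]**2 bounds the i-th diagonal Hessian entry, which
        is a variance of the values M[:, i] under the example weights p;
      * beta = tau is a conservative damping factor for tau simultaneous updates.
    Returns the final weights and the loss recorded before each iteration
    and after the last one.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    L = np.maximum((M ** 2).max(axis=0), 1e-12)  # coordinate-wise constants
    beta = float(tau)
    w = np.zeros(n)
    losses = [log_exp_loss(M, w)]
    for _ in range(n_iters):
        # Example weights (a softmax of the negative margins), as in Adaboost.
        z = -M @ w
        p = np.exp(z - z.max())
        p /= p.sum()
        # Sample tau distinct coordinates; all use the same w and p,
        # which mimics tau processors updating at the same time.
        S = rng.choice(n, size=tau, replace=False)
        grad_S = -M[:, S].T @ p
        w[S] -= grad_S / (beta * L[S])
        losses.append(log_exp_loss(M, w))
    return w, losses


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M_demo = rng.uniform(-1.0, 1.0, size=(200, 30))
    _, demo_losses = parallel_cd(M_demo, tau=5, n_iters=100)
    print(demo_losses[0], demo_losses[-1])
```

With beta = tau, convexity of the loss guarantees that no iteration increases it, at the price of shorter steps than a tighter factor would allow.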