Optimal Weak to Strong Learning
This work settles the sample complexity for a foundational problem in machine learning, benefiting researchers in boosting and learning theory.
The authors tackle the problem of constructing a strong learner from a weak learner with minimal training data. They present a new algorithm that uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds, and a matching sample complexity lower bound shows that the algorithm is optimal.
The classic algorithm AdaBoost converts a weak learner, that is, an algorithm that produces a hypothesis slightly better than chance, into a strong learner that achieves arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner but uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds. A sample complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classic problem of constructing a strong learner from a weak learner.
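
To make the weak-to-strong setting concrete, the following is a minimal sketch of classic AdaBoost, the baseline mentioned above. It is not the new algorithm described in this work. It assumes decision stumps as the weak learner and labels in {-1, +1}; the function names (train_stump, stump_predict, adaboost, predict) and the toy data set are hypothetical and chosen only for illustration.

```python
import math

def train_stump(X, y, w):
    """Weak learner (assumed): the threshold stump with the lowest weighted error."""
    best = None  # (weighted_error, feature_index, threshold, sign)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for s in (1, -1):
                # The stump predicts s when x[j] >= t and -s otherwise.
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (s if xi[j] >= t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best

def stump_predict(stump, x):
    _, j, t, s = stump
    return s if x[j] >= t else -s

def adaboost(X, y, rounds):
    """Classic AdaBoost: reweight the sample, collect weighted weak hypotheses."""
    n = len(X)
    w = [1.0 / n] * n  # start from the uniform distribution over the sample
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = stump[0]
        if err >= 0.5:
            break  # the weak learner has no advantage over chance
        # Small constant guards against division by zero for a perfect stump.
        alpha = 0.5 * math.log((1.0 - err + 1e-12) / (err + 1e-12))
        ensemble.append((alpha, stump))
        if err == 0:
            break  # a perfect hypothesis needs no further rounds
        # Increase the weight of misclassified points, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Strong hypothesis: sign of the weighted vote of the weak hypotheses."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data: points 0..9 on a line, labeled +1 exactly on the interval [3, 6].
X = [[float(i)] for i in range(10)]
y = [1 if 3 <= i <= 6 else -1 for i in range(10)]

for rounds in (1, 3):
    model = adaboost(X, y, rounds)
    correct = sum(predict(model, xi) == yi for xi, yi in zip(X, y))
    print(rounds, "round(s):", correct, "of", len(X), "training points correct")
```

On this toy set no single stump can represent the interval, so one round classifies 7 of the 10 training points correctly, while the weighted vote after three rounds classifies all 10 correctly. The sketch only illustrates training error on a fixed sample; the sample complexity question addressed in this work concerns how many training examples are needed for the resulting hypothesis to generalize.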