LG, Aug 9, 2012

Margin Distribution Controlled Boosting

arXiv:1208.1846v1
Originality: Incremental advance
AI Analysis

This work addresses boosting algorithms for machine learning practitioners, offering an incremental improvement over existing methods.

The paper aims to improve the generalization of boosting by directly controlling the margin distribution. It proposes MCBoost, which the authors report outperforms AdaBoost and several other boosting methods on UCI benchmark datasets.

Schapire's margin theory provides a theoretical explanation for the success of boosting-type methods and shows that a good margin distribution (MD) of the training samples is essential for generalization. However, what makes an MD "good" is vague; consequently, many recently developed algorithms each try to generate an MD that is good in their own sense in order to improve boosting generalization. Unlike these methods, which control the MD only indirectly, in this paper we propose an alternative boosting algorithm termed Margin distribution Controlled Boosting (MCBoost), which directly controls the MD by introducing and optimizing a key adjustable margin parameter. MCBoost's optimization adopts the column generation technique to ensure fast convergence and a small number of weak classifiers in the final MCBooster. We empirically demonstrate that: 1) AdaBoost is actually also an MD-controlled algorithm whose iteration number acts as a parameter controlling the distribution, and 2) the generalization performance of MCBoost on UCI benchmark datasets is better than that of AdaBoost, L2Boost, LPBoost, AdaBoost-CG and MDBoost.
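To make the notion of a margin distribution concrete, the sketch below computes the normalized margin used in margin theory, y * sum_t(alpha_t * h_t(x)) / sum_t(alpha_t), for a plain AdaBoost ensemble of threshold stumps, and prints how the smallest training margin changes with the number of rounds. This is not MCBoost and does not use column generation; the toy 1-D dataset, the stump learner, and all function names are assumptions made for illustration only.

```python
import math

def stump_predict(x, threshold, polarity):
    """Threshold stump on a scalar feature: polarity if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity

def fit_adaboost(xs, ys, rounds):
    """Plain AdaBoost over threshold stumps; returns a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n
    distinct = sorted(set(xs))
    # Candidate thresholds: one below the minimum, then midpoints between distinct values.
    thresholds = [distinct[0] - 1.0] + [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    ensemble = []
    for _ in range(rounds):
        best = None
        for t in thresholds:
            for p in (1, -1):
                # Weighted training error of this stump under the current sample weights.
                err = sum(w for w, x, y in zip(weights, xs, ys) if stump_predict(x, t, p) != y)
                if best is None or err < best[0]:
                    best = (err, t, p)
        err, t, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, p))
        # Up-weight misclassified samples, down-weight correct ones, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, p))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def normalized_margins(ensemble, xs, ys):
    """Normalized margin of each sample; values lie in [-1, 1]. Assumes some alpha is nonzero."""
    total_alpha = sum(abs(a) for a, _, _ in ensemble)
    return [y * sum(a * stump_predict(x, t, p) for a, t, p in ensemble) / total_alpha
            for x, y in zip(xs, ys)]

# Hypothetical toy data: alternating label blocks that no single stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1, 1, -1, -1, 1, 1, -1, -1]

for rounds in (1, 3, 10, 50):
    margins = normalized_margins(fit_adaboost(xs, ys, rounds), xs, ys)
    print(rounds, "rounds: min margin =", round(min(margins), 4))
```

With a single round the ensemble is one stump, so every margin is either +1 or -1; adding rounds reshapes the whole distribution, which is the sense in which the abstract describes AdaBoost's iteration number as a parameter controlling the MD.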

