LGOct 29, 2021

Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion

Jean-Samuel Leboeuf, Frédéric LeBlanc, Mario Marchand

arXiv:2111.00062v13.11 citationsHas Code

Originality Incremental advance

AI Analysis

This work provides improved theoretical guarantees for generalization in machine learning, particularly for VC classes, though it is incremental in nature.

The paper tackles the problem of deriving tighter generalization bounds for VC classes by introducing a hypergeometric tail inversion and optimizing the ghost sample trick, resulting in bounds that are nearly never vacuous and consistently tighter than existing ones for reasonable dataset sizes.

We significantly improve the generalization bounds for VC classes by using two main ideas. First, we consider the hypergeometric tail inversion to obtain a very tight non-uniform distribution-independent risk upper bound for VC classes. Second, we optimize the ghost sample trick to obtain a further non-negligible gain. These improvements are then used to derive a relative deviation bound, a multiclass margin bound, as well as a lower bound. Numerical comparisons show that the new bound is nearly never vacuous, and is tighter than other VC bounds for all reasonable data set sizes.

View on arXiv PDF Code

Similar