LG FA ML

Mar 20, 2020

Sample Complexity Result for Multi-category Classifiers of Bounded Variation

arXiv:2003.09176v23.32 citations

Originality: Incremental advance

AI Analysis

This work addresses theoretical guarantees for multi-class classification in machine learning, offering an incremental improvement in sample complexity bounds for bounded variation classifiers.

The paper bounds the sample complexity of multi-category classifiers whose component functions are of bounded variation, deriving an estimate that depends on the number of classes C. For such function classes, a sharper decomposition of the fat-shattering dimension improves the dependency on C from O(C^(d/2+1)) to O(C ln^2(C)), and this improvement carries over to the sample complexity estimate.

We control the probability of the uniform deviation between empirical and generalization performances of multi-category classifiers by an empirical L1-norm covering number when these performances are defined on the basis of the truncated hinge loss function. The only assumption made on the functions implemented by multi-category classifiers is that they are of bounded variation (BV). For such classifiers, we derive a sample size estimate sufficient for these performances to be close with high probability. In particular, we are interested in the dependency of this estimate on the number C of classes. To this end, we first upper bound the fat-shattering dimension (the scale-sensitive version of the VC-dimension) of sets of BV functions defined on R^d, obtaining a bound of order O(1/epsilon^d) as the scale epsilon goes to zero. Second, we provide a sharper decomposition result for the fat-shattering dimension in terms of C, which for sets of BV functions gives an improvement from O(C^(d/2+1)) to O(C ln^2(C)). This improvement then propagates to the sample complexity estimate.
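To give a feel for the two growth rates in C quoted above, here is a minimal Python sketch that evaluates C^(d/2+1) and C ln^2(C) for a few class counts. It is only an illustration: all constants hidden by the O-notation are dropped, the function names `old_rate`, `new_rate`, and `compare` are hypothetical, and the values d = 2 and C in {10, 100, 1000} are arbitrary choices not taken from the paper.

```python
import math


def old_rate(C, d):
    """Growth term C^(d/2 + 1) of the earlier decomposition (constants dropped)."""
    return C ** (d / 2 + 1)


def new_rate(C):
    """Growth term C * ln^2(C) of the sharper decomposition (constants dropped)."""
    return C * math.log(C) ** 2


def compare(d, class_counts=(10, 100, 1000)):
    """Return (C, old term, new term) for each class count; purely illustrative."""
    return [(C, old_rate(C, d), new_rate(C)) for C in class_counts]


if __name__ == "__main__":
    # d = 2 is an assumed input dimension chosen only for this illustration.
    for C, old, new in compare(d=2):
        print(f"C={C:>5}  C^(d/2+1)={old:.3e}  C*ln^2(C)={new:.3e}")
```

The comparison is asymptotic in nature: for small d and small C the logarithmic factor can make the second term the larger of the two, and the dropped constants may matter in practice.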
