LGMLSep 17, 2018

Span error bound for weighted SVM with applications in hyperparameter selection

arXiv:1809.06124v11 citations
Originality Incremental advance
AI Analysis

This work provides incremental improvements for researchers and practitioners using weighted SVM by offering faster and more accurate hyperparameter selection tools.

The authors extended the span error bound theory to weighted SVM, enabling computationally efficient hyperparameter selection methods like span-rule, which outperformed alternatives such as K-fold cross-validation and the ξ-α bound on 14 benchmark datasets by being the most effective and best predictor of test error in mean square error.

Weighted SVM (or fuzzy SVM) is the most widely used SVM variant owning its effectiveness to the use of instance weights. Proper selection of the instance weights can lead to increased generalization performance. In this work, we extend the span error bound theory to weighted SVM and we introduce effective hyperparameter selection methods for the weighted SVM algorithm. The significance of the presented work is that enables the application of span bound and span-rule with weighted SVM. The span bound is an upper bound of the leave-one-out error that can be calculated using a single trained SVM model. This is important since leave-one-out error is an almost unbiased estimator of the test error. Similarly, the span-rule gives the actual value of the leave-one-out error. Thus, one can apply span bound and span-rule as computationally lightweight alternatives of leave-one-out procedure for hyperparameter selection. The main theoretical contributions are: (a) we prove the necessary and sufficient condition for the existence of the span of a support vector in weighted SVM; and (b) we prove the extension of span bound and span-rule to weighted SVM. We experimentally evaluate the span bound and the span-rule for hyperparameter selection and we compare them with other methods that are applicable to weighted SVM: the $K$-fold cross-validation and the $ξ-α$ bound. Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than $K$-fold cross-validation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes