A bagging and importance sampling approach to Support Vector Machines
This work addresses computational efficiency for SVMs in large-scale data contexts, but it appears incremental because it builds on existing nearest-neighbor ideas.
The paper addresses the problem of solving Support Vector Machines (SVMs) on large databases. It proposes a bagging and importance sampling approach that aims for faster solutions without a significant increase in prediction error, and it evaluates performance on benchmark examples.
An importance sampling and bagging approach to solving the support vector machine (SVM) problem in the context of large databases is presented and evaluated. Our algorithm builds on the nearest neighbors ideas presented in Camelo et al. (2015). As in that reference, the goal of the present proposal is to achieve a faster solution of the SVM problem without a significant increase in the prediction error. The performance of the methodology is evaluated on benchmark examples, and theoretical aspects of subsample methods are discussed.
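
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how weighted subsampling and bagging could be combined for a linear SVM; it is not the authors' method. Everything specific here is an assumption made for illustration: the sampling score (the fraction of a point's k nearest neighbours with the opposite label, plus a small floor), the Pegasos stochastic subgradient solver used as the base SVM, the majority vote, and all function names and parameter values. No reweighting of the sampled points is applied.

```python
import random

def knn_heterogeneity(X, y, k=5):
    """Fraction of each point's k nearest neighbours that carry the opposite label."""
    scores = []
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X) if j != i
        )
        scores.append(sum(y[j] != y[i] for _, j in dists[:k]) / k)
    return scores

def pegasos(X, y, lam=0.01, steps=2000, rng=random):
    """Linear SVM fitted by Pegasos; the bias is handled as a constant feature."""
    w = [0.0] * (len(X[0]) + 1)
    for t in range(1, steps + 1):
        i = rng.randrange(len(X))
        xi = list(X[i]) + [1.0]
        margin = y[i] * sum(a * b for a, b in zip(w, xi))
        eta = 1.0 / (lam * t)
        w = [(1.0 - eta * lam) * a for a in w]       # shrink (regularisation)
        if margin < 1.0:                             # hinge loss is active
            w = [a + eta * y[i] * b for a, b in zip(w, xi)]
    return w

def fit_bagged_svm(X, y, n_models=5, sample_size=60, floor=0.05, seed=0):
    """Train n_models SVMs, each on a subsample drawn with assumed weights."""
    rng = random.Random(seed)
    # Assumption: points whose neighbours disagree on the label are sampled more often.
    weights = [floor + s for s in knn_heterogeneity(X, y)]
    models = []
    for _ in range(n_models):
        idx = rng.choices(range(len(X)), weights=weights, k=sample_size)
        models.append(pegasos([X[i] for i in idx], [y[i] for i in idx], rng=rng))
    return models

def predict(models, x):
    """Majority vote over the bagged linear classifiers (labels are +1 / -1)."""
    xi = list(x) + [1.0]
    votes = sum(1 if sum(a * b for a, b in zip(w, xi)) >= 0 else -1 for w in models)
    return 1 if votes >= 0 else -1

def make_blobs(n_per_class, seed=0):
    """Hypothetical toy data: two Gaussian blobs centred at (2, 2) and (-2, -2)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            X.append((rng.gauss(2 * label, 1), rng.gauss(2 * label, 1)))
            y.append(label)
    return X, y
```

A possible usage is `X, y = make_blobs(100)` followed by `models = fit_bagged_svm(X, y)` and `predict(models, X[0])`. Each base model sees only `sample_size` points, which is where a speed-up over fitting on the full data would come from. The nearest-neighbour scoring in this sketch is a brute-force pass over all pairs and would need a faster neighbour search on a genuinely large database.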