Large Scale, Large Margin Classification using Indefinite Similarity Measures
This work addresses the problems of scalability and kernel restrictions in classification for machine learning practitioners, offering an incremental improvement over existing methods.
The paper tackles the limitations of kernelized SVMs by proposing a scalable method for using indefinite similarity measures in large margin classification. The method achieves accuracy competitive with kernelized SVMs at lower training and test time complexity; on CIFAR-10, the resulting model is more than 5 times sparser and more accurate than a kernelized SVM using RBF kernels.
Despite the success of the popular kernelized support vector machines, they have two major limitations: they are restricted to Positive Semi-Definite (PSD) kernels, and their training complexity scales at least quadratically with the size of the data. Many natural measures of similarity between pairs of samples are not PSD, e.g., invariant kernels and those that are implicitly or explicitly defined by latent variable models. In this paper, we investigate scalable approaches for using indefinite similarity measures in large margin frameworks. In particular, we show that a normalization of similarity to a subset of the data points constitutes a representation suitable for linear classifiers. The result is a classifier that is competitive with kernelized SVMs in terms of accuracy while having better training and test time complexities. Experimental results demonstrate that, on the CIFAR-10 dataset, the model equipped with similarity measures invariant to rigid and non-rigid deformations can be made more than 5 times sparser while being more accurate than a kernelized SVM using RBF kernels.
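
The idea of representing each sample by its normalized similarity to a subset of the data can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the similarity (negative Euclidean distance, which is not PSD because its Gram matrix has a zero diagonal), the random choice of basis points, the per-feature standardization used as the "normalization", the toy two-blob data, and the plain subgradient-descent hinge-loss trainer. The helper names `similarity`, `fit_feature_map`, `transform`, and `train_linear_hinge` are hypothetical.

```python
import numpy as np

def similarity(X, B):
    """Hypothetical indefinite similarity: negative Euclidean distance."""
    diff = X[:, None, :] - B[None, :, :]      # shape (n, m, d) via broadcasting
    return -np.sqrt((diff ** 2).sum(axis=2))  # shape (n, m)

def fit_feature_map(X_train, n_basis, rng):
    """Pick a random subset as basis points and compute normalization stats."""
    idx = rng.choice(len(X_train), size=n_basis, replace=False)
    basis = X_train[idx]
    S = similarity(X_train, basis)
    mean = S.mean(axis=0)
    std = S.std(axis=0) + 1e-12               # guard against division by zero
    return basis, mean, std

def transform(X, basis, mean, std):
    """Represent each sample by its standardized similarity to the basis."""
    return (similarity(X, basis) - mean) / std

def train_linear_hinge(F, y, lam=1e-2, lr=0.1, epochs=200):
    """Linear large-margin classifier: L2-regularized hinge loss,
    minimized with full-batch subgradient descent (illustrative only)."""
    n, d = F.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (F @ w + b) < 1          # samples violating the margin
        grad_w = lam * w - (y[active, None] * F[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (assumed): two Gaussian blobs with labels -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, size=(100, 2)),
               rng.normal(+3.0, 1.0, size=(100, 2))])
y = np.concatenate([-np.ones(100), np.ones(100)])

basis, mean, std = fit_feature_map(X, n_basis=10, rng=rng)
F = transform(X, basis, mean, std)            # shape (200, 10)
w, b = train_linear_hinge(F, y)

pred = np.where(F @ w + b >= 0, 1.0, -1.0)
acc = (pred == y).mean()
print("training accuracy:", acc)
```

In this sketch, classifying a new sample costs one similarity evaluation per basis point, so the number of basis points plays the role that the number of support vectors plays in a kernelized SVM.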