A Large Dimensional Analysis of Least Squares Support Vector Machines
This work provides theoretical insight into the mechanisms of SVM-type methods and the impact of kernel choice. It is an incremental contribution of interest to researchers in machine learning theory and high-dimensional statistics.
The authors study the performance of kernel least squares support vector machines (LS-SVMs) in high-dimensional settings through a large-dimensional analysis under a two-class Gaussian mixture model. They show that the decision function is well approximated by a normally distributed random variable whose mean and variance depend explicitly on the kernel function, and they validate this result on the MNIST and Fashion-MNIST datasets, which agree closely with the theory despite being non-Gaussian.
In this article, a large dimensional performance analysis of kernel least squares support vector machines (LS-SVMs) is provided under the assumption of a two-class Gaussian mixture model for the input data. Building upon recent advances in random matrix theory, we show that, when the dimension $p$ of the data and their number $n$ are both large, the LS-SVM decision function can be well approximated by a normally distributed random variable, the mean and variance of which depend explicitly on the local behavior of the kernel function. This theoretical result is then applied to the MNIST and Fashion-MNIST datasets which, despite their non-Gaussianity, exhibit a convincingly close behavior. Most importantly, our analysis provides a deeper understanding of the mechanisms at play in SVM-type methods, in particular of the impact of the choice of the kernel function, as well as of some of the theoretical limits of these methods in separating high dimensional Gaussian vectors.
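The setting described above can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes the standard LS-SVM linear system with regularization term $n/\gamma$, a Gaussian kernel applied to $\|x_i - x_j\|^2 / p$, and a symmetric two-class Gaussian mixture with identity covariance. The function names (`rbf_kernel`, `lssvm_fit`, `lssvm_decision`, `simulate`) and all parameter values (`gamma`, `sigma2`, `shift`, `p`, `n`) are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma2=1.0):
    """Gaussian kernel on squared distances normalized by the dimension p (assumed scaling)."""
    p = A.shape[1]
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma2 * p))

def lssvm_fit(X, y, gamma=1.0, sigma2=1.0):
    """Solve the LS-SVM system: alpha = S^{-1}(y - b 1), with S = K + (n/gamma) I."""
    n = X.shape[0]
    S = rbf_kernel(X, X, sigma2) + (n / gamma) * np.eye(n)
    S_inv_y = np.linalg.solve(S, y)
    S_inv_1 = np.linalg.solve(S, np.ones(n))
    b = S_inv_y.sum() / S_inv_1.sum()   # enforces sum(alpha) == 0
    alpha = S_inv_y - b * S_inv_1
    return alpha, b

def lssvm_decision(X_train, alpha, b, X_new, sigma2=1.0):
    """Decision function g(x) = sum_i alpha_i k(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

def simulate(p=64, n=128, n_test=400, shift=2.0, seed=0):
    """Draw a two-class Gaussian mixture with means -mu and +mu, fit, and score test points."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[0] = shift

    def sample(m):
        y = np.repeat([-1.0, 1.0], m // 2)           # balanced labels
        X = rng.standard_normal((m, p)) + y[:, None] * mu
        return X, y

    X, y = sample(n)
    X_test, y_test = sample(n_test)
    alpha, b = lssvm_fit(X, y)
    return lssvm_decision(X, alpha, b, X_test), y_test

if __name__ == "__main__":
    g, y_test = simulate()
    for label in (-1.0, 1.0):
        scores = g[y_test == label]
        print(f"class {label:+.0f}: mean={scores.mean():.4f}, std={scores.std():.4f}")
```

Plotting a histogram of `g` for each class, and repeating the experiment with larger `p` and `n`, is one way to check empirically whether the per-class decision values look approximately Gaussian, as the analysis predicts.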