$\ell_0$-Regularized Quadratic Surface Support Vector Machines
This work addresses overfitting and interpretability issues in quadratic SVMs for machine learning practitioners, but it is incremental as it builds on existing methods with a sparsity constraint.
The paper tackles overfitting and interpretability in kernel-free quadratic surface SVMs by introducing a sparse variant with an ℓ₀-norm constraint, and it demonstrates reduced overfitting while maintaining strong classification performance on real-world datasets.
Kernel-free quadratic surface support vector machines have recently gained traction due to their flexibility in modeling nonlinear decision boundaries without relying on kernel functions. However, the introduction of a full quadratic classifier significantly increases the number of model parameters, scaling quadratically with data dimensionality, which often leads to overfitting and makes interpretation difficult. To address these challenges, we propose a sparse variant of the QSVM by enforcing a cardinality constraint on the model parameters. While enhancing generalization and promoting sparsity, leveraging the $\ell_0$-norm inevitably incurs additional computational complexity. To tackle this, we develop a penalty decomposition algorithm capable of producing solutions that provably satisfy the first-order Lu-Zhang optimality conditions. Our approach accommodates both hinge and quadratic loss functions. In both cases, we demonstrate that the subproblems arising within the algorithm either admit closed-form solutions or can be solved efficiently through dual formulations, which contributes to the method's overall effectiveness. We also analyze the convergence behavior of the algorithm under both loss settings. Finally, we validate our approach on several real-world datasets, demonstrating its ability to reduce overfitting while maintaining strong classification performance. The complete implementation and experimental code are publicly available at https://github.com/raminzandvakili/L0-QSVM.