LG, OC, ML · May 28, 2022

Feature subset selection for kernel SVM classification via mixed-integer optimization

arXiv:2205.14325v1 · 2 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses feature selection in nonlinear classification for machine learning practitioners, offering an incremental improvement over existing optimization-based methods.

The paper tackles feature subset selection for kernel SVM classification by proposing a mixed-integer linear optimization formulation based on kernel-target alignment. With a reduced version of the problem, the formulation is computationally efficient, and it can often outperform the linear-SVM-based formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.

We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware. The goal of this paper is to establish an MIO approach for selecting the best subset of features for kernel SVM classification. To measure the performance of subset selection, we use the kernel-target alignment, which is the distance between the centroids of two response classes in a high-dimensional feature space. We propose a mixed-integer linear optimization (MILO) formulation based on the kernel-target alignment for feature subset selection, and this MILO problem can be solved to optimality using optimization software. We also derive a reduced version of the MILO problem to accelerate our MILO computations. Experimental results show good computational efficiency for our MILO formulation with the reduced problem. Moreover, our method can often outperform the linear-SVM-based MILO formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.
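To make the selection criterion concrete, the following is a minimal sketch, not the paper's MILO formulation. It assumes a Gaussian (RBF) kernel with a hypothetical bandwidth parameter gamma and labels in {+1, -1}, and it scores a feature subset by the squared distance between the two class centroids in the kernel-induced feature space, which is how the abstract describes the kernel-target alignment. The function names (rbf_kernel, centroid_distance_sq, best_subset) are illustrative, and the exhaustive search is a brute-force stand-in for the optimization solver that is only practical for very small numbers of features.

```python
from itertools import combinations

import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def centroid_distance_sq(X, y, subset, gamma=1.0):
    """Squared distance between class centroids in feature space,
    using only the columns of X listed in `subset`.

    Assumes y contains +1 / -1 labels and both classes are present.
    """
    cols = list(subset)
    K = rbf_kernel(X[:, cols], gamma)
    pos = y == 1
    neg = y == -1
    # w_i = 1/n_plus for positives, -1/n_minus for negatives, so that
    # w^T K w = ||mean(phi(x) | +1) - mean(phi(x) | -1)||^2.
    w = np.where(pos, 1.0 / pos.sum(), -1.0 / neg.sum())
    return float(w @ K @ w)


def best_subset(X, y, k, gamma=1.0):
    """Brute-force search over all subsets of exactly k features
    (a stand-in for the solver; cost grows combinatorially)."""
    p = X.shape[1]
    return max(
        combinations(range(p), k),
        key=lambda s: centroid_distance_sq(X, y, s, gamma),
    )


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    X = np.array([[0.0, 0.30], [0.1, -0.20], [2.0, 0.25], [2.1, -0.15]])
    y = np.array([1, 1, -1, -1])
    print(best_subset(X, y, k=1))
```

On the toy data, the subset containing the separating feature scores higher than the noise feature, because restricting the kernel to an informative feature pulls the two class centroids apart. An MIO formulation replaces the enumeration with binary selection variables so that a solver can prove optimality without visiting every subset.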

