LG, Jul 29, 2023

Multi-view Sparse Laplacian Eigenmaps for nonlinear Spectral Feature Selection

arXiv:2307.15905v1, 2 citations, h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses feature selection for machine learning practitioners who work with high-dimensional datasets, though it appears incremental because it builds on existing graph-based and sparse methods.

The authors tackled the problem of high-dimensional data complexity by proposing Multi-view Sparse Laplacian Eigenmaps (MSLE) for feature selection, achieving a 2.72% error rate with SVM after reducing features by 90% and 96.69% accuracy with an 80% reduction on the UCI-HAR dataset.

The complexity of high-dimensional datasets presents significant challenges for machine learning models, including overfitting, computational cost, and difficulty in interpreting results. To address these challenges, it is essential to identify an informative subset of features that captures the essential structure of the data. In this study, the authors propose Multi-view Sparse Laplacian Eigenmaps (MSLE) for feature selection, which combines multiple views of the data, enforces sparsity constraints, and employs a scalable optimization algorithm to identify a subset of features that captures the fundamental data structure. MSLE is a graph-based approach that leverages multiple views of the data to construct a more robust and informative representation of high-dimensional data. The method applies sparse eigendecomposition to reduce the dimensionality of the data, yielding a reduced feature set. The optimization problem is solved with an iterative algorithm that alternates between updating the sparse coefficients and updating the graph Laplacian matrix. The sparse coefficients are updated using a soft-thresholding operator, while the graph Laplacian matrix is updated using the normalized graph Laplacian. To evaluate MSLE, the authors conducted experiments on the UCI-HAR dataset, which comprises 561 features, and reduced the feature space by 10% to 90%. Their results show that even after the feature space is reduced by 90%, the Support Vector Machine (SVM) maintains an error rate of 2.72%. Moreover, the authors observe that the SVM reaches an accuracy of 96.69% with an 80% reduction in the overall feature space.
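The abstract names two building blocks of the alternating scheme without giving their formulas: a soft-thresholding operator and the normalized graph Laplacian. The sketch below shows the standard textbook definitions of both in plain Python, as an illustration only. The function names `soft_threshold` and `normalized_laplacian`, the threshold parameter `lam`, and the list-of-lists affinity matrix `W` are assumptions made for this example. The abstract does not say how MSLE builds or combines its per-view graphs, so that step is not shown.

```python
import math


def soft_threshold(x, lam):
    """Standard soft-thresholding: sign(x) * max(|x| - lam, 0).

    Values with |x| <= lam are set to zero, which is what produces sparsity.
    `lam` is a hypothetical non-negative threshold, not a value from the paper.
    """
    return math.copysign(max(abs(x) - lam, 0.0), x)


def normalized_laplacian(W):
    """Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.

    W is assumed to be a symmetric, non-negative affinity matrix given as a
    list of lists. Nodes with zero degree get a zero scaling factor, so their
    diagonal entry is 1 (conventions for isolated nodes vary).
    """
    n = len(W)
    degrees = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degrees]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


# Example: shrink a small coefficient vector and build L for a 2-node graph.
coefficients = [soft_threshold(c, 0.5) for c in [1.2, -0.3, -2.0]]
L = normalized_laplacian([[0.0, 1.0], [1.0, 0.0]])
```

In this example the middle coefficient falls below the threshold and is zeroed out, while the other two are shrunk toward zero by `lam`. Shrinkage of this kind is the usual way a soft-thresholding step enforces sparsity. How MSLE maps sparse coefficients to selected features is not described in the abstract.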
