LG Mar 22, 2022

On Supervised Feature Selection from High Dimensional Feature Spaces

Yijing Yang, Wei Wang, Hongyu Fu, C. -C. Jay Kuo

arXiv:2203.11924v3 17.374 citations h-index: 19

Originality: Incremental advance

AI Analysis

This work addresses feature selection for machine learning practitioners dealing with high-dimensional data, though it appears incremental as it builds on existing methods.

The authors address high-dimensional feature spaces in machine learning by proposing a supervised feature selection methodology built on two tests: the discriminant feature test (DFT) for classification and the relevant feature test (RFT) for regression. In their experiments, which include LeNet-5 deep features for MNIST and Fashion-MNIST, the tests select lower-dimensional feature subspaces while maintaining high decision performance.

The application of machine learning to image and video data often yields a high dimensional feature space. Effective feature selection techniques identify a discriminant feature subspace that lowers computational and modeling costs with little performance degradation. A novel supervised feature selection methodology is proposed for machine learning decisions in this work. The resulting tests are called the discriminant feature test (DFT) and the relevant feature test (RFT) for classification and regression problems, respectively. The DFT and RFT procedures are described in detail. Furthermore, we compare the effectiveness of DFT and RFT with several classic feature selection methods. To this end, we use deep features obtained by LeNet-5 for the MNIST and Fashion-MNIST datasets as illustrative examples. Other datasets with handcrafted and gene expression features are also included for performance evaluation. Experimental results show that DFT and RFT can select a lower dimensional feature subspace distinctly and robustly while maintaining high decision performance.
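
To make the idea of supervised, per-feature selection concrete, here is a minimal Python sketch. It is not the DFT or RFT procedure from the paper, which this summary does not spell out. It assumes one simple scoring rule: each feature is scored by the lowest weighted class entropy that a single threshold on that feature can achieve, and the k features with the lowest scores are kept. The function names, the evenly spaced candidate thresholds, and the `n_bins` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_entropy_after_split(x, y, threshold, n_classes):
    """Weighted class entropy of the two subsets x <= threshold and x > threshold."""
    total = 0.0
    for mask in (x <= threshold, x > threshold):
        n = mask.sum()
        if n == 0:
            continue  # an empty side contributes nothing
        counts = np.bincount(y[mask], minlength=n_classes)
        p = counts[counts > 0] / n
        entropy = -(p * np.log2(p)).sum()
        total += (n / len(y)) * entropy
    return total


def score_feature(x, y, n_classes, n_bins=16):
    """Lowest weighted entropy over evenly spaced candidate thresholds (assumed rule)."""
    lo, hi = x.min(), x.max()
    if lo == hi:
        # Constant feature: no useful split, so return the entropy of the full set.
        return weighted_entropy_after_split(x, y, lo, n_classes)
    candidates = np.linspace(lo, hi, n_bins + 1)[1:-1]  # interior bin edges only
    return min(weighted_entropy_after_split(x, y, t, n_classes) for t in candidates)


def select_features(X, y, k, n_bins=16):
    """Return indices of the k lowest-scoring features and all per-feature scores.

    X: (n_samples, n_features) float array; y: non-negative integer class labels.
    """
    n_classes = int(y.max()) + 1
    scores = np.array(
        [score_feature(X[:, j], y, n_classes, n_bins) for j in range(X.shape[1])]
    )
    return np.argsort(scores)[:k], scores
```

Because each feature is scored independently, the cost grows linearly with the number of features, which is the property that makes this kind of filter attractive for high dimensional inputs. A regression variant could swap the entropy for a weighted mean squared error, again as an assumption rather than a statement of the paper's method.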
