LG | NE | Feb 20, 2024

Compact NSGA-II for Multi-objective Feature Selection

arXiv:2402.12625v1 | 3 citations | h-index: 13 | SMC
Originality: Incremental advance
AI Analysis

This work addresses feature selection for machine learning practitioners by offering a more memory-efficient evolutionary method, though it is incremental as it builds on NSGA-II.

The authors tackle the expensive multi-objective feature selection problem by proposing a binary Compact NSGA-II algorithm, which uses probability vectors to reduce memory use and the number of fitness evaluations. On five datasets with limited evaluation budgets, it performs more efficiently than NSGA-II as measured by hypervolume.
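Hypervolume is the metric behind that comparison, so a small worked computation may help. The sketch below is a generic two-objective hypervolume for minimization, not the authors' evaluation code. The function name `hypervolume_2d`, the choice of (error rate, fraction of features kept) as objectives, and the reference point `(1.0, 1.0)` are all assumptions made for illustration.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that improve on the reference point in both objectives,
    # sorted by the first objective (ties broken by the second).
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]  # lowest second-objective value covered so far
    for x, y in pts:
        if y < best_y:
            # This point adds a new horizontal strip below the area already counted.
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


# Hypothetical front: (error rate, fraction of features kept).
front = [(0.1, 0.8), (0.3, 0.4)]
hv = hypervolume_2d(front, ref=(1.0, 1.0))
```

A larger value means the front covers more of the objective space between itself and the reference point. Dominated points add nothing to the total.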

Feature selection is an expensive, challenging task in machine learning and data mining that aims to remove irrelevant and redundant features. Removing them improves classification accuracy and reduces the budget and memory requirements of classification or any other post-processing task conducted after feature selection. In this regard, we define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features. To select optimal features, we propose a binary Compact NSGA-II (CNSGA-II) algorithm. Compactness represents the population as a probability distribution, which makes evolutionary algorithms more memory-efficient and also reduces the number of fitness evaluations. Instead of holding two populations during the optimization process, our proposed method uses several Probability Vectors (PVs) to generate new individuals. Each PV efficiently explores a region of the search space to find non-dominated solutions, instead of generating candidate solutions from a small population as is common in most evolutionary algorithms. To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection. The reported results for expensive optimization cases with a limited budget on five datasets show that CNSGA-II performs more efficiently than the well-known NSGA-II method in terms of the hypervolume (HV) performance metric while requiring less memory. The proposed method and experimental results are explained and analyzed in detail.
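The following sketch illustrates the general idea of replacing a population with a probability vector. It is not the authors' CNSGA-II: it uses a single PV with a classic compact-GA style update driven by Pareto dominance, whereas the abstract describes several PVs. The toy objective function, the `relevant` mask, the step size, and all function names are hypothetical stand-ins. A real run would train and score a classifier for every evaluated mask.

```python
import random


def sample(pv, rng):
    """Draw one binary feature mask; bit i is 1 with probability pv[i]."""
    return [1 if rng.random() < p else 0 for p in pv]


def dominates(f, g):
    """True if objective vector f Pareto-dominates g (all objectives minimized)."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))


def update_pv(pv, winner, loser, step):
    """Shift each probability toward the winner's bit where the two masks differ."""
    out = []
    for p, w, l in zip(pv, winner, loser):
        if w != l:
            p = p + step if w == 1 else p - step
        out.append(min(1.0, max(0.0, p)))  # keep probabilities in [0, 1]
    return out


def toy_objectives(mask, relevant):
    """Hypothetical stand-in for (classification error, number of selected features)."""
    missed = sum(1 for m, r in zip(mask, relevant) if r and not m)
    return (missed, sum(mask))


def compact_search(relevant, evaluations=400, step=0.05, seed=0):
    rng = random.Random(seed)
    pv = [0.5] * len(relevant)  # the only persistent "population" state
    archive = []                # (mask, objectives) pairs, mutually non-dominated
    for _ in range(evaluations // 2):  # two fitness evaluations per iteration
        a, b = sample(pv, rng), sample(pv, rng)
        fa, fb = toy_objectives(a, relevant), toy_objectives(b, relevant)
        if dominates(fa, fb):
            pv = update_pv(pv, a, b, step)
        elif dominates(fb, fa):
            pv = update_pv(pv, b, a, step)
        for mask, f in ((a, fa), (b, fb)):
            # Skip candidates that are dominated by, or duplicate, an archived point.
            if any(dominates(g, f) or g == f for _, g in archive):
                continue
            archive = [(m, g) for m, g in archive if not dominates(f, g)]
            archive.append((mask, f))
    return pv, archive


pv, archive = compact_search(relevant=[1, 0, 1, 0, 0, 1, 0, 0])
```

Apart from the archive of non-dominated solutions, memory use is one float per feature, which is the memory argument the abstract makes for compact algorithms.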

