LG · AI · Jul 27, 2023

MVMR-FS : Non-parametric feature selection algorithm based on Maximum inter-class Variation and Minimum Redundancy

arXiv:2307.14643v1 · 1 citation · h-index: 6
Originality: Incremental advance
AI Analysis

The paper addresses a domain-specific problem in machine learning: filter-based feature selection for continuous data, where redundancy is difficult to measure directly. It offers an incremental improvement over existing methods.

The paper tackles the challenge of measuring feature relevance and redundancy in filter-based feature selection for continuous data, proposing MVMR-FS which uses kernel density estimation and an AGA to optimize feature subsets, achieving a 5% to 11% accuracy improvement over ten state-of-the-art methods.

How to accurately measure the relevance and redundancy of features is an age-old challenge in the field of feature selection. However, existing filter-based feature selection methods cannot directly measure redundancy for continuous data. In addition, most methods rely on manually specifying the number of features, which may introduce errors in the absence of expert knowledge. In this paper, we propose a non-parametric feature selection algorithm based on maximum inter-class variation and minimum redundancy, abbreviated as MVMR-FS. We first introduce supervised and unsupervised kernel density estimation on the features to capture their similarities and differences in inter-class and overall distributions. Subsequently, we present the criteria for maximum inter-class variation and minimum redundancy (MVMR), wherein the inter-class probability distributions are employed to reflect feature relevance and the distances between overall probability distributions are used to quantify redundancy. Finally, we employ an AGA to search for the feature subset that minimizes the MVMR. Compared with ten state-of-the-art methods, MVMR-FS achieves the highest average accuracy and improves the accuracy by 5% to 11%.
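The abstract does not give the MVMR formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes min-max scaling of each feature, a Gaussian kernel with a Silverman-style bandwidth, relevance measured as one minus the overlap of class-conditional densities, and redundancy measured as the overlap of two features' overall densities. The names `minmax`, `kde`, `overlap`, `relevance`, `redundancy`, and `mvmr_score` are hypothetical.

```python
import math
import statistics
from itertools import combinations

# Fixed evaluation grid over [-1, 2]; features are min-max scaled to [0, 1] first.
GRID = [-1.0 + 3.0 * i / 399 for i in range(400)]
STEP = GRID[1] - GRID[0]

def minmax(values):
    """Scale a feature to [0, 1] so densities of different features are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def kde(sample):
    """Gaussian kernel density estimate of `sample`, evaluated on GRID."""
    n = len(sample)
    spread = statistics.stdev(sample) if n > 1 else 0.0
    # Silverman-style bandwidth; 0.05 is an assumed fallback for degenerate samples.
    h = 1.06 * spread * n ** -0.2 or 0.05
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in sample)
            for g in GRID]

def overlap(p, q):
    """Approximate integral of min(p, q): near 1 for identical densities, near 0 for disjoint ones."""
    return STEP * sum(min(a, b) for a, b in zip(p, q))

def relevance(feature, labels):
    """Assumed inter-class variation: mean (1 - overlap) over all pairs of class-conditional KDEs."""
    by_class = {}
    for value, label in zip(minmax(feature), labels):
        by_class.setdefault(label, []).append(value)
    densities = [kde(vals) for vals in by_class.values()]
    pairs = list(combinations(densities, 2))
    if not pairs:  # fewer than two classes: no inter-class variation to measure
        return 0.0
    return sum(1.0 - overlap(p, q) for p, q in pairs) / len(pairs)

def redundancy(feature_a, feature_b):
    """Assumed redundancy: overlap of the two features' overall (label-free) KDEs."""
    return overlap(kde(minmax(feature_a)), kde(minmax(feature_b)))

def mvmr_score(features, labels):
    """Hypothetical subset objective to minimize: mean redundancy minus mean relevance."""
    rel = sum(relevance(f, labels) for f in features) / len(features)
    pairs = list(combinations(features, 2))
    red = sum(redundancy(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return red - rel
```

Under these assumptions, a feature whose values differ sharply between classes scores a relevance near 1, while a feature distributed identically in every class scores near 0. The search step is not shown; the paper uses an AGA to pick the subset, for which a function like `mvmr_score` would serve as the fitness to minimize.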

