Optimizing Feature Selection for Binary Classification with Noisy Labels: A Genetic Algorithm Approach
This paper addresses feature selection under label noise, a domain-specific problem, and offers incremental improvements.
The paper tackles feature selection for binary classification with noisy labels by proposing a genetic algorithm-based method, NMFS-GA, which improves accuracy and interpretability on synthetic datasets and on real-world data such as the ADNI dataset for dementia conversion prediction.
Feature selection in noisy label scenarios remains an understudied topic. We propose a novel genetic algorithm-based approach, the Noise-Aware Multi-Objective Feature Selection Genetic Algorithm (NMFS-GA), for selecting optimal feature subsets in binary classification with noisy labels. NMFS-GA offers a unified framework for selecting feature subsets that are both accurate and interpretable. We evaluate NMFS-GA on synthetic datasets with label noise, a Breast Cancer dataset enriched with noisy features, and a real-world ADNI dataset for dementia conversion prediction. Our results indicate that NMFS-GA can effectively select feature subsets that improve the accuracy and interpretability of binary classifiers in scenarios with noisy labels.
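To make the general idea concrete, the following is a minimal sketch of genetic-algorithm feature selection on data with flipped labels. It is not the NMFS-GA method itself, whose operators and objectives are not specified in the abstract. Several choices here are assumptions made only for illustration: the synthetic data generator, the nearest-centroid classifier, truncation selection, and a single scalarized fitness (validation accuracy minus a per-feature penalty) standing in for a true multi-objective formulation. All function names and parameters are hypothetical.

```python
import random

def make_noisy_data(n=200, n_informative=3, n_noise=7, flip_rate=0.2, seed=0):
    """Synthetic binary data: the first n_informative features carry signal,
    the rest are pure noise; roughly flip_rate of the labels are flipped."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.5 if label == 1 else -1.5
        row = [rng.gauss(shift, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        if rng.random() < flip_rate:
            label = 1 - label  # inject label noise
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_va, y_va, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, t in zip(X_va, y_va):
        d = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in (0, 1)}
        pred = 0 if d[0] <= d[1] else 1
        correct += pred == t
    return correct / len(y_va)

def fitness(mask, data, penalty=0.01):
    """Scalarized objective (an assumption): accuracy minus a subset-size penalty."""
    return centroid_accuracy(*data, mask) - penalty * sum(mask)

def select_features(data, n_features, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve binary feature masks; the top half of each generation survives."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, data), reverse=True)
        elite = scored[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, data))

X, y = make_noisy_data()
data = (X[:150], y[:150], X[150:], y[150:])  # simple train/validation split
best = select_features(data, n_features=len(X[0]))
print("selected mask:", best)
print("validation accuracy:", centroid_accuracy(*data, best))
```

The size penalty is one simple way to favor smaller subsets, on the assumption that fewer features are easier to interpret. A multi-objective method would instead keep accuracy and subset size as separate objectives and return a set of trade-off solutions rather than a single mask.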