LG AI NE · Sep 18, 2023

Noise-Augmented Boruta: The Neural Network Perturbation Infusion with Boruta Feature Selection

Hassan Gharoun, Navid Yazdanjoe, Mohammad Sadegh Khorshidi, Amir H. Gandomi

arXiv:2309.09694v13.81 citations · h-index: 15

Originality: Synthesis-oriented

AI Analysis

This incremental improvement addresses feature selection for data scientists dealing with high-dimensional datasets.

The paper tackles feature selection in high-dimensional data by injecting noise into the shadow features of the Boruta algorithm; the authors report that this variant outperforms classic Boruta on four benchmark datasets.

With the surge in data generation, both vertically (i.e., volume of data) and horizontally (i.e., dimensionality), the burden of the curse of dimensionality has become increasingly palpable. Feature selection, a key facet of dimensionality reduction techniques, has advanced considerably to address this challenge. One such advancement is the Boruta feature selection algorithm, which successfully discerns meaningful features by contrasting them to their permutated counterparts known as shadow features. However, the significance of a feature is shaped more by the data's overall traits than by its intrinsic value, a sentiment echoed in the conventional Boruta algorithm where shadow features closely mimic the characteristics of the original ones. Building on this premise, this paper introduces an innovative approach to the Boruta feature selection algorithm by incorporating noise into the shadow variables. Drawing parallels from the perturbation analysis framework of artificial neural networks, this evolved version of the Boruta method is presented. Rigorous testing on four publicly available benchmark datasets revealed that this proposed technique outperforms the classic Boruta algorithm, underscoring its potential for enhanced, accurate feature selection.
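The core idea in the abstract, comparing each real feature against permuted and noise-perturbed shadow copies, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the importance score here is absolute Pearson correlation with the target (a stand-in for the model-based importance used by Boruta and the neural-network perturbation analysis mentioned in the abstract), the acceptance rule is a simple majority vote rather than a statistical test, and the names `noisy_shadow`, `noise_augmented_boruta`, `noise_scale`, and `make_demo` are hypothetical.

```python
import random
import statistics


def abs_corr(xs, ys):
    """Absolute Pearson correlation: a stand-in importance score (assumption)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def noisy_shadow(column, noise_scale, rng):
    """Permute a column, then add Gaussian noise scaled by the column's std."""
    shadow = column[:]
    rng.shuffle(shadow)  # breaks any link between the feature and the target
    sd = statistics.pstdev(column)
    return [v + rng.gauss(0.0, noise_scale * sd) for v in shadow]


def noise_augmented_boruta(columns, target, n_iter=50, noise_scale=0.1, seed=0):
    """Return indices of features that beat the best noisy shadow in most rounds."""
    rng = random.Random(seed)
    hits = [0] * len(columns)
    for _ in range(n_iter):
        shadows = [noisy_shadow(col, noise_scale, rng) for col in columns]
        threshold = max(abs_corr(s, target) for s in shadows)
        for j, col in enumerate(columns):
            if abs_corr(col, target) > threshold:
                hits[j] += 1
    # Simplified acceptance rule (assumption): majority vote over iterations.
    return [j for j, h in enumerate(hits) if h > n_iter / 2]


def make_demo(n=300, seed=1):
    """Synthetic data: column 0 drives the target, columns 1 and 2 are pure noise."""
    rng = random.Random(seed)
    signal = [rng.gauss(0, 1) for _ in range(n)]
    noise_a = [rng.gauss(0, 1) for _ in range(n)]
    noise_b = [rng.gauss(0, 1) for _ in range(n)]
    target = [2.0 * s + rng.gauss(0, 0.5) for s in signal]
    return [signal, noise_a, noise_b], target


if __name__ == "__main__":
    columns, target = make_demo()
    print(noise_augmented_boruta(columns, target))
```

Setting `noise_scale=0.0` reduces the shadows to plain permutations, which corresponds to the shadow construction of classic Boruta as described in the abstract; raising it perturbs the shadows further away from the original feature distribution.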
