LG CV | Nov 26, 2022

Where to Pay Attention in Sparse Training for Feature Selection?

Ghada Sokar, Zahra Atashgahi, Mykola Pechenizkiy, Decebal Constantin Mocanu

arXiv:2211.14627v1 | 13.624 citations | h-index: 49 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the computational bottleneck of feature selection on high-dimensional datasets, though it appears to be an incremental improvement over existing sparse training methods.

The paper tackles the problem of slow convergence in neural network-based feature selection methods by proposing an attention-based sparse training algorithm for autoencoders that quickly identifies informative features. The approach outperforms state-of-the-art methods in feature selection while substantially reducing training iterations and computational costs across 10 diverse datasets.

A new line of research for feature selection based on neural networks has recently emerged. Despite its superiority to classical methods, it requires many training iterations to converge and detect informative features. The computational time becomes prohibitively long for datasets with a large number of samples or a very high-dimensional feature space. In this paper, we present a new efficient unsupervised method for feature selection based on sparse autoencoders. In particular, we propose a new sparse training algorithm that optimizes a model's sparse topology during training to pay attention to informative features quickly. The attention-based adaptation of the sparse topology enables fast detection of informative features after a few training iterations. We performed extensive experiments on 10 datasets of different types, including image, speech, text, artificial, and biological. These datasets cover a wide range of characteristics, such as low- and high-dimensional feature spaces, and small and large numbers of training samples. Our proposed approach outperforms the state-of-the-art methods in terms of selecting informative features while reducing training iterations and computational costs substantially. Moreover, the experiments show the robustness of our method in extremely noisy environments.
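The abstract does not spell out the topology update rule, so the following is only a rough sketch of the general idea of adapting a sparse input layer toward informative features. It assumes, purely for illustration, that a feature's importance is the summed absolute weight of its active outgoing connections, and that a topology update prunes the weakest connections and regrows the same number at positions sampled in proportion to that importance. The names `feature_importance`, `prune_and_regrow`, and `select_features` are hypothetical and are not taken from the paper or its code.

```python
import random


def feature_importance(weights, mask):
    """Assumed importance score: sum of |w| over active links of each input feature."""
    return [
        sum(abs(w) for w, m in zip(w_row, m_row) if m)
        for w_row, m_row in zip(weights, mask)
    ]


def prune_and_regrow(weights, mask, fraction, rng):
    """One illustrative topology update on an (inputs x hidden) sparse layer.

    Drops the `fraction` of active connections with the smallest magnitude, then
    regrows the same number at inactive positions, favouring inputs that were
    important before pruning. Modifies `weights` and `mask` in place.
    """
    n_in, n_hidden = len(weights), len(weights[0])
    active = [(i, j) for i in range(n_in) for j in range(n_hidden) if mask[i][j]]
    k = int(fraction * len(active))
    if k == 0:
        return
    importance = feature_importance(weights, mask)

    # Prune: deactivate the k weakest active connections.
    active.sort(key=lambda ij: abs(weights[ij[0]][ij[1]]))
    for i, j in active[:k]:
        mask[i][j] = 0
        weights[i][j] = 0.0

    # Regrow: sample k inactive positions without replacement, weighted by
    # the importance of the input feature they attach to.
    inactive = [(i, j) for i in range(n_in) for j in range(n_hidden) if not mask[i][j]]
    probs = [importance[i] + 1e-8 for i, _ in inactive]
    for _ in range(k):
        idx = rng.choices(range(len(inactive)), weights=probs)[0]
        i, j = inactive.pop(idx)
        probs.pop(idx)
        mask[i][j] = 1
        weights[i][j] = 0.0  # regrown connections start at zero


def select_features(weights, mask, k):
    """Return the indices of the k features with the highest assumed importance."""
    importance = feature_importance(weights, mask)
    return sorted(range(len(importance)), key=lambda i: -importance[i])[:k]
```

In a training loop, a step like `prune_and_regrow(weights, mask, 0.2, random.Random(0))` would run periodically between gradient updates, and `select_features` would be called once at the end. The update keeps the number of active connections constant, so the sparsity level of the layer does not change during training.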
