Andrii Trelin

2 Papers

LG · Jul 8, 2020
Binary Stochastic Filtering: feature selection and beyond

Andrii Trelin, Aleš Procházka

Feature selection is one of the most decisive tools in understanding data and machine learning models. Among other methods, sparsity induced by an $L^{1}$ penalty is one of the simplest and best-studied approaches to this problem. Although such regularization is frequently used in neural networks to achieve sparsity of weights or unit activations, it is unclear how it can be applied to feature selection. This work aims to extend neural networks with the ability to select features automatically by rethinking how sparsity regularization is used, namely by stochastically penalizing feature involvement instead of the layer weights. The proposed method demonstrated superior efficiency compared to a few classical methods, with minimal or no computational overhead, and can be applied directly to any existing architecture. Furthermore, the method generalizes easily to neuron pruning and to the selection of regions of importance in spectral data.
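
One way to read the contrast drawn in this abstract, as a sketch under assumptions not stated in the source: let $\mathcal{L}_{\text{task}}$ be the task loss, $W$ a layer's weight matrix, $p_i \in [0,1]$ a hypothetical per-feature pass probability, and $\lambda$ a penalty strength. A conventional $L^{1}$ penalty acts on the weights, while a penalty on feature involvement would act on the $p_i$:

$$
\mathcal{L}_{\text{weights}} = \mathcal{L}_{\text{task}} + \lambda \sum_{j,k} |W_{jk}|,
\qquad
\mathcal{L}_{\text{features}} = \mathcal{L}_{\text{task}} + \lambda \sum_{i} |p_i|
$$

Under this reading, driving some $p_i$ to zero removes feature $i$ from the model entirely, whereas a single zero weight removes only one connection. The exact form of the penalty used in the paper may differ.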

LG · Feb 12, 2019
Binary Stochastic Filtering: a Method for Neural Network Size Minimization and Supervised Feature Selection

Andrii Trelin, Ales Prochazka

This work proposes Binary Stochastic Filtering (BSF), an algorithm for feature selection and neuron pruning. The method defines a filtering layer that penalizes the amount of information involved in the training process. This information can be the input data or the output of the previous layer, which leads directly to feature selection or neuron pruning, respectively, producing an ad hoc subset of features or selecting the optimal number of neurons in each layer. The filtering layer stochastically passes or drops features based on individual weights, which are tuned by standard backpropagation during training. A multifold decrease in neural network size was achieved in the experiments. In addition, the method was able to select a minimal number of features, surpassing literature references in accuracy/dimensionality ratio.
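
The filtering layer described in this abstract can be illustrated with a minimal forward-pass sketch. It assumes that each feature's individual weight acts as a pass probability in [0, 1], that features are kept at inference time when that probability is at least 0.5, and that the penalty is the L1 norm of the weights. The names `bsf_forward` and `involvement_penalty` are hypothetical and do not come from the paper. The abstract also does not say how gradients flow through the sampling step, so that part is omitted.

```python
import random


def bsf_forward(features, pass_probs, rng, training=True):
    """Gate each feature with its own pass probability (illustrative only)."""
    if len(features) != len(pass_probs):
        raise ValueError("one pass probability is needed per feature")
    if training:
        # Stochastic mode: feature i passes with probability pass_probs[i].
        mask = [1.0 if rng.random() < p else 0.0 for p in pass_probs]
    else:
        # Assumed inference rule: keep features whose probability is >= 0.5.
        mask = [1.0 if p >= 0.5 else 0.0 for p in pass_probs]
    filtered = [x * m for x, m in zip(features, mask)]
    return filtered, mask


def involvement_penalty(pass_probs, strength):
    """Assumed L1-style penalty on how many features are let through."""
    return strength * sum(abs(p) for p in pass_probs)


rng = random.Random(0)            # fixed seed for reproducibility
x = [0.7, -1.2, 3.0, 0.4]         # one input sample with four features
probs = [1.0, 0.0, 0.9, 0.2]      # hypothetical learned pass probabilities

filtered, mask = bsf_forward(x, probs, rng)
selected = [i for i, p in enumerate(probs) if p >= 0.5]
```

Here the first feature always passes and the second is always dropped, while the other two are sampled on each call. Reading off the indices in `selected` gives the feature subset under the assumed 0.5 threshold.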