LG, Apr 20, 2016

Embedded all relevant feature selection with Random Ferns

arXiv:1604.06133v1, 5 citations

Originality: Incremental advance

AI Analysis

This paper addresses all relevant feature selection for machine learning practitioners. The advance is incremental: it builds on existing wrapper methods and reduces their computational cost.

The paper tackles all relevant feature selection by incorporating it into the training process, using implicitly generated shadow attributes evaluated with a random ferns classifier. Results confirm that the approach is effective, but the fully stochastic nature of random ferns restricts it to problems of small dimension or to use as one part of a broader feature selection procedure.

Many machine learning methods can produce variable importance scores expressing the usability of each feature in the context of the produced model; those scores on their own are not yet sufficient to generate a feature selection, especially when an all relevant selection is required. Although there are wrapper methods aiming to solve this problem, they substantially increase the required computational effort. In this paper I investigate the idea of incorporating all relevant selection within the training process by producing importance for implicitly generated shadows, attributes irrelevant by design. I propose and evaluate such a method in the context of the random ferns classifier. Experimental results confirm the effectiveness of this approach, although they show that the fully stochastic nature of random ferns limits its applicability to either small dimensions or use as part of a broader feature selection procedure.
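The shadow idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it builds shadows explicitly by permuting each column (the wrapper-style approach the abstract contrasts itself with), and it replaces random ferns importance with a toy score, the absolute difference of class-conditional means. The names `importance`, `shadow_select`, and `make_toy_data`, and the parameters `rounds` and `min_wins`, are hypothetical and chosen only for this illustration.

```python
import random


def importance(column, labels):
    """Toy importance (an assumption, not random ferns importance):
    absolute difference between the class-conditional means."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def shadow_select(columns, labels, rounds=50, min_wins=0.9, seed=0):
    """Keep features whose importance beats the best shadow in at least
    a `min_wins` fraction of `rounds` independent shadow draws."""
    rng = random.Random(seed)
    wins = [0] * len(columns)
    for _ in range(rounds):
        # Shadows are permuted copies of the real columns: the permutation
        # breaks any link with the labels, so they are irrelevant by design.
        shadow_scores = []
        for col in columns:
            shadow = list(col)
            rng.shuffle(shadow)
            shadow_scores.append(importance(shadow, labels))
        threshold = max(shadow_scores)
        for j, col in enumerate(columns):
            if importance(col, labels) > threshold:
                wins[j] += 1
    return [j for j, w in enumerate(wins) if w / rounds >= min_wins]


def make_toy_data(n=200, seed=1):
    """Two informative features, one balanced uninformative feature,
    and one pure-noise feature. Labels alternate 0, 1, 0, 1, ..."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    columns = [
        [y + rng.gauss(0, 0.3) for y in labels],    # follows the label
        [-y + rng.gauss(0, 0.3) for y in labels],   # mirrors the label
        [(i // 2) % 2 for i in range(n)],           # same values in both classes
        [rng.gauss(0, 1) for _ in range(n)],        # pure noise
    ]
    return columns, labels


columns, labels = make_toy_data()
selected = shadow_select(columns, labels)
print("selected feature indices:", selected)
```

Repeating the comparison over many shadow draws is what makes the decision a statistical one: a single draw could let a noisy feature beat the shadows by chance. The explicit shadows here double the number of columns scored in every round, which is the kind of overhead the paper aims to avoid by generating shadows implicitly during training.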
