CV, IT, ML · Nov 29, 2018

Simple stopping criteria for information theoretic feature selection

arXiv:1811.11971v2 · 15 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific optimization bottleneck in feature selection, namely deciding when a greedy search should stop. The advance is incremental because it builds on existing information-theoretic approaches.

The paper tackles the problem of automatically determining when to stop greedy feature selection in information-theoretic methods. It proposes two stopping criteria based on monitoring the conditional mutual information among groups of variables, and shows that these criteria are easy to implement and to integrate into existing methods.

Feature selection aims to select the smallest feature subset that yields the minimum generalization error. In the rich literature on feature selection, information theory-based approaches seek a subset of features such that the mutual information between the selected features and the class labels is maximized. Despite the simplicity of this objective, several open problems in optimization remain. These include, for example, the automatic determination of the optimal subset size (i.e., the number of features) or of a stopping criterion if a greedy search strategy is adopted. In this paper, we suggest two stopping criteria based solely on monitoring the conditional mutual information (CMI) among groups of variables. Using the recently developed multivariate matrix-based Rényi's α-entropy functional, which can be directly estimated from data samples, we show that the CMI among groups of variables can be easily computed without any decomposition or approximation, making our criteria easy to implement and to integrate seamlessly into any existing information-theoretic feature selection method with a greedy search strategy.
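The abstract does not spell out the two criteria, so the following is only a minimal sketch of the general idea: estimate CMI with the matrix-based Rényi's α-entropy and stop a greedy search once the best candidate's CMI with the labels, given the already selected features, becomes small. The entropy is computed from the eigenvalues of a trace-normalized Gram matrix, and joint entropies use the Hadamard product of Gram matrices, as in the matrix-based Rényi entropy literature. The function names, the Gaussian kernel width `sigma`, the value `alpha=1.01`, and the fixed `threshold` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gram(x, sigma=1.0):
    """Gaussian Gram matrix of the samples in x, normalized to unit trace."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)                      # accept (n,) or (n, d)
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    k = np.exp(-sq / (2.0 * sigma ** 2))
    return k / np.trace(k)

def joint(*mats):
    """Hadamard product of normalized Gram matrices, renormalized to unit trace."""
    h = np.ones_like(mats[0])
    for m in mats:
        h = h * m
    return h / np.trace(h)

def entropy(a, alpha=1.01):
    """Matrix-based Renyi alpha-entropy: log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    lam = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # clip tiny negative round-off
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def cmi(ax, ay, az, alpha=1.01):
    """I(X; Y | Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z)."""
    return (entropy(joint(ax, az), alpha) + entropy(joint(ay, az), alpha)
            - entropy(az, alpha) - entropy(joint(ax, ay, az), alpha))

def greedy_select(X, y, threshold=0.05, sigma=1.0, alpha=1.01):
    """Greedy forward selection with a CMI-based stop (threshold is an assumption)."""
    n, d = X.shape
    ay = gram(y, sigma)
    grams = [gram(X[:, j], sigma) for j in range(d)]
    a_sel = np.ones((n, n)) / n        # empty selected set: zero entropy
    selected, remaining = [], list(range(d))
    while remaining:
        scores = [cmi(grams[j], ay, a_sel, alpha) for j in remaining]
        best = int(np.argmax(scores))
        if scores[best] < threshold:   # best candidate adds too little: stop
            break
        j = remaining.pop(best)
        selected.append(j)
        a_sel = joint(a_sel, grams[j]) # Gram matrix of the selected group
    return selected
```

Calling `greedy_select(X, y)` on an `(n, d)` feature array and a length-`n` label vector returns the indices of the chosen features in selection order. A fixed threshold is the simplest possible stopping rule; in practice the cut-off and the kernel width would need to be tuned to the data, and a statistical test on the CMI value may be a more principled alternative.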

