Sparse and geometry-aware generalisation of the mutual information for joint discriminative clustering and feature selection
This paper addresses feature selection for clustering in high-dimensional data, but the contribution is incremental: it adds a sparsity penalty to an existing generalisation of mutual information.
The paper tackles the problem of feature selection in clustering by introducing Sparse GEMINI, a discriminative clustering model that maximizes a geometry-aware generalization of mutual information with an l1 penalty, avoiding combinatorial exploration and scaling to high-dimensional data. Results show it is competitive and selects relevant variable subsets without prior hypotheses.
Feature selection in clustering is a hard task: it requires discovering the relevant clusters and, simultaneously, the variables that are relevant to these clusters. While feature selection algorithms are often model-based, relying on optimised model selection or strong assumptions on the data distribution, we introduce Sparse GEMINI, a discriminative clustering model that maximises a geometry-aware generalisation of the mutual information called GEMINI under a simple l1 penalty. This algorithm avoids the burden of combinatorial feature subset exploration and scales easily to high-dimensional data and large numbers of samples, while requiring only the design of a discriminative clustering model. We demonstrate the performance of Sparse GEMINI on synthetic and large-scale datasets. Our results show that Sparse GEMINI is a competitive algorithm that can select subsets of variables relevant to the clustering without using relevance criteria or prior hypotheses.
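
To make the idea of an l1-penalised discriminative clustering objective concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a linear softmax clustering model and substitutes the classical mutual information for GEMINI, whose exact form is not given in this abstract. The function names, the element-wise l1 penalty on the weight matrix `W`, and the regularisation weight `lam` are all illustrative assumptions.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def mutual_information(p):
    """Estimate I(X; Y) from an (n_samples, n_clusters) matrix of p(y | x_i).

    Assumption: this is the classical (KL-based) mutual information,
    used here as a stand-in for the geometry-aware GEMINI.
    """
    eps = 1e-12
    marginal = p.mean(axis=0)  # empirical cluster proportions p(y)
    kl_per_sample = np.sum(p * (np.log(p + eps) - np.log(marginal + eps)), axis=1)
    return float(kl_per_sample.mean())


def penalised_objective(W, X, lam):
    """Objective to maximise: clustering MI minus an l1 penalty on W.

    W has shape (n_features, n_clusters); lam is a hypothetical
    regularisation weight.
    """
    p = softmax(X @ W)
    return mutual_information(p) - lam * np.abs(W).sum()


def soft_threshold(W, t):
    """Proximal operator of t * ||W||_1: shrinks entries towards zero."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)


def selected_features(W, tol=1e-8):
    """Indices of features whose weight row is not entirely zero."""
    return np.flatnonzero(np.abs(W).sum(axis=1) > tol)
```

In such a scheme, a gradient ascent step on the mutual information term would alternate with `soft_threshold`, and the features whose rows of `W` are driven to zero would be the ones discarded. This is how a sparsity penalty can avoid an explicit search over feature subsets.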