ML LG Jan 31, 2017

Variable selection for clustering with Gaussian mixture models: state of the art

Abdelghafour Talibi, Boujemâa Achchab, Rafik Lasri

arXiv:1701.08946v12.613 citations

Originality: Synthesis-oriented

AI Analysis

This is an incremental review article that surveys state-of-the-art variable selection techniques for model-based clustering, targeting researchers and practitioners dealing with high-dimensional data.

The paper addresses the problem of variable selection in Gaussian mixture models for clustering, which is essential for handling large modern databases, and reviews existing methods while suggesting opportunities for improvement.

Mixture models have become widely used in clustering because of the probabilistic framework on which they are based. However, for modern databases characterized by their large size, these models perform disappointingly when it comes to specifying the model, which makes the selection of relevant variables essential for this type of clustering. After recalling the basics of model-based clustering, this article examines variable selection methods for model-based clustering and presents opportunities for improving these methods.
