Dirichlet Process Parsimonious Mixtures for clustering
This work provides a Bayesian nonparametric approach to clustering; it is incremental in that it extends existing parsimonious mixture models.
The authors tackled the problem of clustering with parsimonious Gaussian mixture models by proposing Dirichlet Process Parsimonious Mixtures (DPPM), a Bayesian nonparametric formulation that simultaneously infers model parameters, the optimal number of components, and mixture structure from data. Results from simulated and real datasets showed DPPM is an effective nonparametric alternative to parametric models.
Parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proved successful, particularly in cluster analysis. They are generally estimated by maximum likelihood and have also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious Mixtures (DPPM), which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models make it possible to infer simultaneously, from the data, the model parameters, the optimal number of mixture components, and the optimal parsimonious mixture structure. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPPM models and provide a Bayesian model selection framework based on Bayes factors. We apply the models to cluster simulated and real data sets, and compare them to the standard parsimonious mixture models. The results highlight the effectiveness of the proposed models as a nonparametric alternative to the parametric parsimonious models.
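To make the eigenvalue decomposition mentioned above concrete, the following is a minimal sketch of the standard parameterization used for parsimonious Gaussian mixtures, in which a group covariance matrix is written as Sigma_k = lambda_k * D_k * A_k * D_k^T, with lambda_k a scalar volume, D_k an orthogonal matrix of eigenvectors (orientation), and A_k a diagonal matrix with unit determinant (shape). The abstract does not spell out this notation, so the symbols, the function names, and the example matrix are assumptions made for illustration; the code is not the DPPM estimation procedure itself.

```python
import numpy as np

def decompose_covariance(sigma):
    """Split a symmetric positive-definite matrix into (lam, D, A) such that
    sigma = lam * D @ A @ D.T, with D orthogonal and det(A) = 1."""
    eigvals, eigvecs = np.linalg.eigh(sigma)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    d = sigma.shape[0]
    lam = np.prod(eigvals) ** (1.0 / d)           # volume: det(sigma)^(1/d)
    A = np.diag(eigvals / lam)                    # shape: normalized eigenvalues
    return lam, eigvecs, A

def compose_covariance(lam, D, A):
    """Rebuild the covariance matrix from its volume, orientation, and shape."""
    return lam * D @ A @ D.T

# Illustrative 2x2 covariance matrix (hypothetical values).
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, D, A = decompose_covariance(sigma)
sigma_rebuilt = compose_covariance(lam, D, A)
```

Parsimonious variants arise by constraining these three factors, for example by sharing the volume, the orientation, or the shape across groups, or by fixing the shape to the identity for spherical groups. Each choice reduces the number of free covariance parameters.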