An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework
This work addresses a key bottleneck in GMM clustering for data analysts, though it appears incremental because it builds on existing Bayesian frameworks.
The paper tackles model selection for Gaussian Mixture Models (GMMs), that is, determining the number of clusters, and proposes a new Bayesian algorithm that reconstructs the density of the model order faster than Monte Carlo simulation.
To cluster or partition data, we often use Expectation-Maximization (EM) or variational approximation with a Gaussian Mixture Model (GMM), a parametric probability density function represented as a weighted sum of $\hat{K}$ Gaussian component densities. However, model selection to find the underlying $\hat{K}$ is one of the key concerns in GMM clustering, since we can obtain the desired clusters only when $\hat{K}$ is known. In this paper, we propose a new model selection algorithm to explore $\hat{K}$ in a Bayesian framework. The proposed algorithm builds the density of the model order, which information criteria such as AIC and BIC fail to reconstruct. In addition, the algorithm reconstructs this density quickly compared with time-consuming Monte Carlo simulation.
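
For context, the conventional baseline that the abstract contrasts against can be sketched as follows: fit a GMM by EM for each candidate number of components and keep the one with the lowest BIC. This is not the proposed algorithm. It is a minimal illustration under stated assumptions: one-dimensional synthetic data with three well-separated clusters, a quantile-based initialisation, a variance floor to avoid degenerate components, and hypothetical helper names (`fit_gmm_1d`, `bic`, `bic_by_k`, `best_k`) that do not come from the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Weighted sum of Gaussian component densities evaluated at x."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-2):
    """Fit a k-component 1-D GMM by EM and return the final log-likelihood.

    Illustrative only: the initialisation and variance floor are assumptions.
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)
    return sum(math.log(mixture_pdf(x, weights, means, variances) or 1e-300) for x in data)


def bic(log_lik, k, n):
    """BIC for a 1-D GMM: k means, k variances and k - 1 free weights."""
    n_params = 3 * k - 1
    return -2.0 * log_lik + n_params * math.log(n)


# Synthetic data (an assumption for illustration): three separated clusters.
rng = random.Random(0)
data = [rng.gauss(mu, 0.5) for mu in (-5.0, 0.0, 5.0) for _ in range(100)]

bic_by_k = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in range(1, 7)}
best_k = min(bic_by_k, key=bic_by_k.get)  # lowest BIC wins
print(best_k, bic_by_k)
```

A procedure of this kind returns a single selected value of $\hat{K}$ rather than a density over the model order, which is the gap the abstract says the proposed algorithm is meant to fill.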