LG | ML | Jun 10, 2022

Hierarchical mixtures of Gaussians for combined dimensionality reduction and clustering

arXiv:2206.04841v2 | h-index: 44
AI Analysis

HMoGs provide a practical approach to high-dimensional clustering that preserves statistical rigour, interpretability, and uncertainty quantification, which embedding-based, variational, and self-supervised methods often lack.

The authors address the problem of combining dimensionality reduction and clustering in high-dimensional data. They introduce hierarchical mixtures of Gaussians (HMoGs), which unify the two tasks in a single probabilistic model with a closed-form likelihood and exact inference. The model scales efficiently to hundreds of latent dimensions, and experiments on synthetic data and MNIST show that jointly optimizing dimensionality reduction and clustering improves model performance.

We introduce hierarchical mixtures of Gaussians (HMoGs), which unify dimensionality reduction and clustering into a single probabilistic model. HMoGs provide closed-form expressions for the model likelihood, exact inference over latent states and cluster membership, and exact algorithms for maximum-likelihood optimization. The novel exponential family parameterization of HMoGs greatly reduces their computational complexity relative to similar model-based methods, allowing them to efficiently model hundreds of latent dimensions, and thereby capture additional structure in high-dimensional data. We demonstrate HMoGs on synthetic experiments and MNIST, and show how joint optimization of dimensionality reduction and clustering facilitates increased model performance. We also explore how sparsity-constrained dimensionality reduction can further improve clustering performance while encouraging interpretability. By bridging classical statistical modelling with the scale of modern data and compute, HMoGs offer a practical approach to high-dimensional clustering that preserves statistical rigour, interpretability, and uncertainty quantification that is often missing from embedding-based, variational, and self-supervised methods.
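
To make "closed-form likelihood" and "exact inference over cluster membership" concrete, the following is a minimal sketch under a stated assumption: the model is taken to be a linear-Gaussian observation model x = W z + b + noise with a mixture-of-Gaussians prior over the latent z. The abstract does not spell out this structure, so the function names, the parameters (`W`, `b`, `psi`, `weights`, `mus`, `sigmas`), and the diagonal noise are all illustrative choices rather than the authors' parameterization. Under this assumption each component marginalizes to a Gaussian with mean W mu_k + b and covariance W Sigma_k W^T + diag(psi), so the likelihood and the cluster posteriors follow directly.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of x (shape (n, d))."""
    d = x.shape[1]
    diff = x - mean
    chol = np.linalg.cholesky(cov)
    # Solve L y = diff^T so that ||y||^2 is the Mahalanobis distance.
    sol = np.linalg.solve(chol, diff.T)
    maha = np.sum(sol ** 2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def hmog_loglik_and_posterior(x, weights, mus, sigmas, W, b, psi):
    """Closed-form log-likelihood and exact cluster posteriors.

    Assumed (hypothetical) model: z ~ sum_k weights[k] N(mus[k], sigmas[k]),
    x | z ~ N(W z + b, diag(psi)).
    Shapes: x (n, d), weights (K,), mus (K, q), sigmas (K, q, q),
            W (d, q), b (d,), psi (d,).
    """
    noise = np.diag(psi)
    # Marginalizing z within component k gives a Gaussian over x.
    log_joint = np.stack(
        [
            np.log(weights[k])
            + log_gaussian(x, W @ mus[k] + b, W @ sigmas[k] @ W.T + noise)
            for k in range(len(weights))
        ],
        axis=1,
    )  # shape (n, K)
    # Log-sum-exp over components for numerical stability.
    m = log_joint.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    # Exact posterior over cluster membership (responsibilities).
    resp = np.exp(log_joint - log_lik[:, None])
    return log_lik, resp
```

This naive version builds and factorizes a full d x d covariance per component, which is the kind of cost the abstract says the exponential family parameterization reduces; the sketch does not attempt to reproduce that speed-up.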

