COSTML Jun 25, 2013

Constrained Optimization for a Subset of the Gaussian Parsimonious Clustering Models

arXiv:1306.5824v1, 11 citations
Originality: Incremental advance
AI Analysis

This work addresses a known bottleneck in model-based clustering, offering researchers and practitioners an incremental improvement to the EM algorithm for Gaussian mixture models.

The paper tackles the problem of local maxima and singularities in the Gaussian mixture model likelihood surface. It constrains the smallest eigenvalue, the largest eigenvalue, or both, of the component covariance matrices within a subset of the Gaussian Parsimonious Clustering Models (GPCM) family. The constraints are applied through a re-parameterized eigenvalue decomposition of the covariance matrices, and the approach is illustrated with experiments on simulated and real data.

The expectation-maximization (EM) algorithm is an iterative method for finding maximum likelihood estimates when data are incomplete or are treated as being incomplete. The EM algorithm and its variants are commonly used for parameter estimation in applications of mixture models for clustering and classification. This is despite the fact that even the Gaussian mixture model likelihood surface contains many local maxima and is riddled with singularities. Previous work has focused on circumventing this problem by constraining the smallest eigenvalue of the component covariance matrices. In this paper, we consider constraining the smallest eigenvalue, the largest eigenvalue, and both the smallest and largest within the family setting. Specifically, a subset of the GPCM family is considered for model-based clustering, where we use a re-parameterized version of the famous eigenvalue decomposition of the component covariance matrices. Our approach is illustrated using various experiments with simulated and real data.
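To make the idea of bounding covariance eigenvalues concrete, here is a minimal Python sketch. It is not the paper's method: the paper builds the constraints into the estimation of a subset of GPCM models, whereas this sketch simply eigendecomposes a single covariance estimate and clips its eigenvalues to a chosen interval. The function name `clip_covariance_eigenvalues` and the parameters `lower` and `upper` are hypothetical and introduced only for illustration.

```python
import numpy as np


def clip_covariance_eigenvalues(sigma, lower=None, upper=None):
    """Return a symmetric matrix with the eigenvectors of `sigma` and
    eigenvalues clipped to [lower, upper].

    Illustrative only: a post hoc projection, not the constrained
    M-step described in the paper. `lower` and `upper` are assumed
    user-chosen bounds; either may be None to leave that side free.
    """
    sigma = np.asarray(sigma, dtype=float)
    # Symmetrize to guard against small numerical asymmetries.
    sym = 0.5 * (sigma + sigma.T)
    # eigh returns eigenvalues in ascending order for symmetric input.
    eigvals, eigvecs = np.linalg.eigh(sym)
    if lower is not None or upper is not None:
        eigvals = np.clip(eigvals, lower, upper)
    # Rebuild the matrix as V diag(eigvals) V^T.
    return (eigvecs * eigvals) @ eigvecs.T


if __name__ == "__main__":
    # A nearly singular covariance, as can arise when a component
    # collapses onto very few points during EM.
    sigma = np.array([[4.0, 0.0], [0.0, 1e-12]])
    bounded = clip_covariance_eigenvalues(sigma, lower=1e-3, upper=2.0)
    print(np.linalg.eigvalsh(bounded))
```

A lower bound keeps a component's covariance away from singularity, which is where the likelihood becomes unbounded. An upper bound limits how diffuse a component can become. How the bounds should be chosen is not specified here and would depend on the data.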
