ME · ML · Mar 12, 2019

Flexible Clustering with a Sparse Mixture of Generalized Hyperbolic Distributions

arXiv:1903.05054v2
AI Analysis

This paper addresses clustering challenges in high-dimensional real-world data, but the contribution is incremental: it builds on existing mixture models by adding a new parameterization.

The authors tackled robust clustering of high-dimensional data with heavy-tailed or asymmetric clusters by proposing a sparse mixture of generalized hyperbolic distributions with a penalty term, developing an expectation-maximization algorithm, and validating it through simulations and real datasets.

Robust clustering of high-dimensional data is an important topic because clusters in real datasets are often heavy-tailed and/or asymmetric. Traditional approaches to model-based clustering often fail for high-dimensional data, e.g., due to the number of free covariance parameters. A parameterization of the component scale matrices for the mixture of generalized hyperbolic distributions is proposed. This parameterization includes a penalty term in the likelihood. An analytically feasible expectation-maximization algorithm is developed by placing a gamma-lasso penalty constraining the concentration matrix. The proposed methodology is investigated through simulation studies and illustrated using two real datasets.
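
To make the idea of a penalty term in the likelihood concrete, here is a minimal sketch of a penalized mixture log-likelihood. It is not the paper's method: Gaussian components stand in for generalized hyperbolic ones, and a plain L1 penalty on the off-diagonal entries of each concentration (precision) matrix stands in for the gamma-lasso penalty. All function and parameter names (`penalized_loglik`, `lam`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def log_det_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    d = len(a)
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(d))


def gaussian_logpdf(x, mean, precision):
    """Gaussian log-density written in terms of the precision matrix.

    Assumption: Gaussian components replace the generalized hyperbolic ones.
    """
    d = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    quad = sum(diff[j] * precision[j][k] * diff[k]
               for j in range(d) for k in range(d))
    return 0.5 * (log_det_spd(precision) - d * math.log(2 * math.pi) - quad)


def penalized_loglik(data, weights, means, precisions, lam):
    """Mixture log-likelihood minus an L1 penalty on off-diagonal precisions.

    Assumption: a plain lasso penalty replaces the gamma-lasso penalty.
    """
    loglik = 0.0
    for x in data:
        terms = [math.log(w) + gaussian_logpdf(x, m, p)
                 for w, m, p in zip(weights, means, precisions)]
        top = max(terms)  # log-sum-exp for numerical stability
        loglik += top + math.log(sum(math.exp(t - top) for t in terms))
    penalty = lam * sum(abs(p[j][k])
                        for p in precisions
                        for j in range(len(p))
                        for k in range(len(p)) if j != k)
    return loglik - penalty
```

Larger values of `lam` favor precision matrices with more zero off-diagonal entries, which is how a penalty of this kind cuts the effective number of free scale parameters in high dimensions.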
