Multivariate response and parsimony for Gaussian cluster-weighted models
This work addresses model-based clustering of data with multivariate responses, extending Gaussian cluster-weighted models so that correlations between the responses are taken into account.
The authors extend Gaussian cluster-weighted models to multivariate responses and obtain parsimony by constraining parts of an eigen-decomposition of the component covariance matrices. On synthetic and classical real data sets, accounting for linear dependencies in the presence of a linear regression structure gave better clustering performance than existing methods.
A family of parsimonious Gaussian cluster-weighted models is presented. This family concerns a multivariate extension to cluster-weighted modelling that can account for correlations between multivariate responses. Parsimony is attained by constraining parts of an eigen-decomposition imposed on the component covariance matrices. A sufficient condition for identifiability is provided and an expectation-maximization algorithm is presented for parameter estimation. Model performance is investigated on both synthetic and classical real data sets and compared with some popular approaches. Finally, accounting for linear dependencies in the presence of a linear regression structure is shown to offer better performance, vis-à-vis clustering, over existing methodologies.
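To make the eigen-decomposition idea concrete, the following minimal sketch splits a single covariance matrix into volume, orientation, and shape terms. It assumes the volume-shape-orientation form Sigma = lambda * D * A * D^T that is common in parsimonious Gaussian mixture modelling; the abstract does not spell out the exact parameterization, and the names `volume_shape_orientation`, `lam`, `D`, and `A` are hypothetical, chosen only for illustration.

```python
import numpy as np

def volume_shape_orientation(sigma):
    """Split a symmetric positive-definite matrix so that
    sigma = lam * D @ A @ D.T (assumed parameterization, see lead-in)."""
    p = sigma.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric input
    eigvals, eigvecs = np.linalg.eigh(sigma)
    # Reorder so the largest eigenvalue comes first
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Volume: the p-th root of the determinant
    lam = np.prod(eigvals) ** (1.0 / p)
    # Shape: diagonal matrix of scaled eigenvalues, determinant equal to 1
    A = np.diag(eigvals / lam)
    # Orientation: orthogonal matrix of eigenvectors
    D = eigvecs
    return lam, D, A

# Illustrative 2 x 2 covariance matrix (made-up values)
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, D, A = volume_shape_orientation(sigma)
reconstructed = lam * D @ A @ D.T
```

Under this assumed form, a parsimonious family arises by forcing some of `lam`, `D`, or `A` to be equal across mixture components, or by fixing `D` or `A` to the identity, which reduces the number of free covariance parameters that the expectation-maximization algorithm has to estimate.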