Gaussian Mixture Estimation from Weighted Samples
This work addresses a specific problem in statistical estimation that is relevant to researchers and practitioners who work with weighted data. The contribution appears incremental, since it builds on standard Gaussian mixture estimators.
The paper estimates Gaussian mixture parameters from weighted samples with an expectation-maximization method that correctly incorporates the sample weights, and it uses counterexamples to show that existing methods produce wrong estimates.
We consider estimating the parameters of a Gaussian mixture density with a given number of components that best represents a given set of weighted samples. We adopt a density interpretation of the samples by viewing them as a discrete Dirac mixture density over a continuous domain with weighted components. Hence, Gaussian mixture fitting is viewed as density re-approximation. To speed up computation, an expectation-maximization method is proposed that properly considers not only the sample locations, but also the corresponding weights. It is shown that methods from the literature do not treat the weights correctly, resulting in wrong estimates. This is demonstrated with simple counterexamples. The proposed method works in any number of dimensions with the same computational load as standard Gaussian mixture estimators for unweighted samples.
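
To make the idea of weight-aware expectation-maximization concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the paper and may differ from the proposed update equations: it assumes the common formulation in which the samples form a normalized Dirac mixture with weights w_i, the E-step computes the usual responsibilities, and every M-step sum is weighted by w_i. The function name `weighted_gmm_em` and the parameters `init_means`, `n_iter`, `reg`, and `seed` are hypothetical names chosen for illustration.

```python
import numpy as np


def weighted_gmm_em(x, w, k, init_means=None, n_iter=100, reg=1e-6, seed=0):
    """Fit a k-component Gaussian mixture to samples x (n, d) with weights w (n,).

    Illustrative sketch only: sample weights enter every M-step sum.
    Returns (mixing weights, means, covariances).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()  # normalize: the samples form a Dirac mixture density
    n, d = x.shape

    if init_means is None:
        # Assumption: draw initial means from the samples, using the weights.
        rng = np.random.default_rng(seed)
        means = x[rng.choice(n, size=k, replace=False, p=w)].copy()
    else:
        means = np.array(init_means, dtype=float)

    # Start every component with the weighted covariance of all samples.
    diff_all = x - w @ x
    cov_all = (w[:, None] * diff_all).T @ diff_all + reg * np.eye(d)
    covs = np.tile(cov_all, (k, 1, 1))
    pis = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: log(pi_j * N(x_i | mu_j, Sigma_j)) for every sample/component.
        log_r = np.empty((n, k))
        for j in range(k):
            chol = np.linalg.cholesky(covs[j])
            z = np.linalg.solve(chol, (x - means[j]).T)  # shape (d, n)
            maha = np.sum(z ** 2, axis=0)
            log_det = 2.0 * np.sum(np.log(np.diag(chol)))
            log_r[:, j] = np.log(pis[j]) - 0.5 * (
                maha + log_det + d * np.log(2.0 * np.pi)
            )
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: each responsibility is scaled by its sample weight.
        wr = w[:, None] * r
        nk = np.maximum(wr.sum(axis=0), 1e-12)  # guard against empty components
        pis = nk / nk.sum()
        means = (wr.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (wr[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    return pis, means, covs
```

Under this formulation, a sample with integer weight 2 has the same effect as including that sample twice with weight 1, which gives a simple sanity check for any weighted estimator. The per-iteration cost is the same as for unweighted EM, since the weights only add one elementwise multiplication.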