LG · ML · May 6, 2023

Learning Mixtures of Gaussians with Censored Data

arXiv:2305.04127v2 · 1 citation
Originality: Highly original
AI Analysis

This paper addresses a classical statistical learning problem with practical applications, providing finite-sample guarantees for a latent variable model where such guarantees were previously missing.

The paper tackles the problem of learning mixtures of Gaussians from censored data, where samples are only observed if they lie within a specific set, and proposes an algorithm that requires only 1/ε^O(k) samples to estimate the weights and means within ε error.

We study the problem of learning mixtures of Gaussians with censored data. Statistical learning with censored data is a classical problem with numerous practical applications; however, finite-sample guarantees for even simple latent variable models such as Gaussian mixtures are missing. Formally, we are given censored data from a mixture of univariate Gaussians $$ \sum_{i=1}^k w_i \mathcal{N}(\mu_i,\sigma^2), $$ i.e., the sample is observed only if it lies inside a set $S$. The goal is to learn the weights $w_i$ and the means $\mu_i$. We propose an algorithm that takes only $\frac{1}{\varepsilon^{O(k)}}$ samples to estimate the weights $w_i$ and the means $\mu_i$ within $\varepsilon$ error.
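To make the data model in the abstract concrete, here is a minimal sketch of how censored samples from a univariate Gaussian mixture could be simulated. It illustrates only the observation model, not the paper's learning algorithm. The function name `sample_censored_mixture`, the predicate `in_set` standing in for the set $S$, and the example parameters are all assumptions chosen for illustration.

```python
import random


def sample_censored_mixture(weights, means, sigma, in_set, n, seed=0):
    """Draw n observed samples from a censored univariate Gaussian mixture.

    Uses rejection sampling: draw from the uncensored mixture and keep a
    draw only if it lies in S, given here as the hypothetical predicate
    in_set. Loops forever if S has zero probability mass under the mixture.
    """
    rng = random.Random(seed)
    observed = []
    total_draws = 0
    while len(observed) < n:
        total_draws += 1
        # Pick component i with probability w_i, then draw from N(mu_i, sigma^2).
        mu = rng.choices(means, weights=weights, k=1)[0]
        x = rng.gauss(mu, sigma)
        if in_set(x):  # the sample is observed only if it lies inside S
            observed.append(x)
    return observed, total_draws


# Illustrative parameters (assumed, not from the paper): k = 2, S = [0, 3].
weights = [0.3, 0.7]
means = [-1.0, 2.0]
samples, draws = sample_censored_mixture(
    weights, means, 1.0, lambda x: 0.0 <= x <= 3.0, 1000
)
```

Every observed sample lies in $S$, so moment estimates computed naively from `samples` generally do not match the moments of the uncensored mixture. That gap is why the censored setting needs its own estimators.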

