Estimating Mixture Distributions via Stochastic Mirror Descent

Mohammadreza Ahmadypour, Tara Javidi, Farinaz Koushanfar

arXiv:2605.24929

Predicted impact: top 59% in ML · last 90 days
Originality: Incremental advance

AI Analysis

For statisticians and machine learning practitioners, this paper provides a flexible and scalable framework for mixture estimation, though it is an incremental extension of existing optimization techniques.

The paper revisits distribution estimation via mixture models, framing it as stochastic convex optimization and proposing estimators based on stochastic mirror descent. The method scales efficiently with many candidate components and achieves near-optimal convergence rates in KL divergence and ℓ2-norm, with improved sample efficiency.

We revisit the classical problem of estimating an unknown distribution from its samples by fitting a mixture model that minimizes cross-entropy loss. Framing the task as a stochastic convex optimization problem over the space of $M$-component mixture distributions, we propose a family of estimators derived from the stochastic mirror descent (SMD) algorithm. This optimization-based approach provides a principled and flexible framework that generalizes traditional estimators and yields a variety of novel estimators through the choice of Bregman divergence. A key advantage of our method is that it scales efficiently with the number of candidate components $f_i$; that is, one can employ a large set of basis distributions in the mixture model without incurring significant computational overhead. This enables richer approximations and improved estimation accuracy. Moreover, for categorical distributions (discrete outcomes), our estimators do not require a strict lower bound; in other words, our framework does not require precise knowledge of the support of the distribution. We demonstrate that, under mild conditions, the proposed $\phi$-SMD estimators achieve near-optimal convergence rates in both Kullback-Leibler (KL) divergence and $\ell_2$-norm and offer practical benefits when computation is expensive. Our numerical analysis highlights improved performance guarantees over classical estimators, particularly in terms of sample efficiency and scalability.
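To make the setup concrete, here is a minimal sketch of one possible instance of the idea. It assumes fixed candidate components $f_i$, the per-sample cross-entropy loss $-\log \sum_i w_i f_i(x)$, and the negative-entropy mirror map, which turns the mirror descent step into a multiplicative (exponentiated-gradient) update on the simplex. The function names, the constant step size, the iterate averaging, and the two-Gaussian toy data are illustrative assumptions, not the paper's exact $\phi$-SMD estimators.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def smd_mixture_weights(samples, components, step_size=0.05):
    """Entropic stochastic mirror descent for mixture weights (a sketch).

    samples:    iterable of observations, processed once in order
    components: list of callables f_i(x) returning a density value
    Returns the running average of the weight iterates.
    """
    m = len(components)
    theta = [0.0] * m          # dual variables (log-weights up to a constant)
    w = [1.0 / m] * m          # start from the uniform mixture
    avg = [0.0] * m
    for t, x in enumerate(samples, start=1):
        f = [c(x) for c in components]
        mix = sum(wi * fi for wi, fi in zip(w, f))
        if mix > 0.0:
            # Gradient of -log(mix) w.r.t. w_i is -f_i / mix, so the
            # descent step in the dual space adds step_size * f_i / mix.
            theta = [th + step_size * fi / mix for th, fi in zip(theta, f)]
        # Map back to the simplex with a numerically stable softmax.
        top = max(theta)
        unnorm = [math.exp(th - top) for th in theta]
        total = sum(unnorm)
        w = [u / total for u in unnorm]
        # Running average of the iterates.
        avg = [a + (wi - a) / t for a, wi in zip(avg, w)]
    return avg


# Toy data (assumed for illustration): 30% from N(-2, 1), 70% from N(2, 1).
random.seed(0)
data = [
    random.gauss(-2.0, 1.0) if random.random() < 0.3 else random.gauss(2.0, 1.0)
    for _ in range(5000)
]
candidates = [
    lambda x: gaussian_pdf(x, -2.0, 1.0),
    lambda x: gaussian_pdf(x, 2.0, 1.0),
]
weights = smd_mixture_weights(data, candidates)
print(weights)
```

Each step evaluates every candidate density once at the current sample, so the per-sample cost of this sketch grows linearly with the number of components. Swapping the softmax step for the update induced by a different mirror map would give a different member of the family.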

