It's all In the (Exponential) Family: An Equivalence between Maximum Likelihood Estimation and Control Variates for Sketching Algorithms

Keegan Kang, Kerong Wang, Ding Zhang, Rameshwar Pratap, Bhisham Dev Verma, Benedict H. W. Wong

arXiv:2601.22378v2 | 1.7 | h-index: 6

Originality: Incremental advance

AI Analysis

This work aims to make maximum likelihood estimation in sketching algorithms faster and more numerically stable for machine learning practitioners. The contribution appears incremental, since it extends known equivalence results.

The paper proves that under certain exponential family conditions, an optimal control variate estimator achieves the same asymptotic variance as the maximum likelihood estimator, providing an Expectation-Maximization algorithm for the MLE. Experiments on the bivariate Normal distribution show the EM algorithm is faster and more numerically stable than other root-finding methods, with expected applicability to other distributions.

Maximum likelihood estimators (MLE) and control variate estimators (CVE) have been used in conjunction with known information across sketching algorithms and applications in machine learning. We prove that under certain conditions in an exponential family, an optimal CVE will achieve the same asymptotic variance as the MLE, giving an Expectation-Maximization (EM) algorithm for the MLE. For the bivariate Normal distribution, experiments show the EM algorithm is faster and more numerically stable than other root-finding algorithms for the MLE, and we expect this to hold across distributions satisfying these conditions. We show how the EM algorithm leads to reproducibility for algorithms using MLE / CVE, and demonstrate how the EM algorithm leads to finding the MLE when the CV weights are known.
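
To make the MLE / CVE connection concrete, here is a minimal sketch for one special case: a bivariate Normal with zero means and unit variances, where only the correlation rho is unknown. That setup is an assumption chosen for illustration; the abstract names the bivariate Normal but does not state this parameterization. In this case the MLE is the real root of the cubic rho^3 - b*rho^2 + (a - 1)*rho - b = 0, with a = mean(x^2 + y^2) and b = mean(x*y), and the optimal control variate weight on (x^2 - 1) and (y^2 - 1) is rho / (1 + rho^2). Plugging the current estimate into that weight and iterating gives a fixed point that satisfies the same cubic. The function names are hypothetical, and this plug-in iteration is a stand-in, not necessarily the paper's EM algorithm.

```python
import numpy as np


def summary_stats(x, y):
    """Return a = mean(x^2 + y^2) and b = mean(x*y)."""
    a = float(np.mean(x**2 + y**2))
    b = float(np.mean(x * y))
    return a, b


def mle_by_roots(x, y):
    """MLE of rho via a generic polynomial root finder (np.roots)."""
    a, b = summary_stats(x, y)
    # Score equation: rho^3 - b*rho^2 + (a - 1)*rho - b = 0
    roots = np.roots([1.0, -b, a - 1.0, -b])
    real = roots[np.abs(roots.imag) < 1e-9].real
    # A cubic always has a real root; take the one closest to the naive estimate b.
    return float(real[np.argmin(np.abs(real - b))])


def cv_fixed_point(x, y, max_iter=100, tol=1e-12):
    """Iterate the control variate estimator with plug-in weights.

    Assumption (illustrative): known zero means and unit variances, so
    E[x^2] = E[y^2] = 1 serve as the control variates' known expectations.
    """
    a, b = summary_stats(x, y)
    rho = b  # start from the plain inner-product estimate
    for _ in range(max_iter):
        c = rho / (1.0 + rho**2)      # optimal CV weight at the current rho
        new_rho = b - c * (a - 2.0)   # CV-corrected estimate
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    return rho


def sample_pair(n, rho, seed=0):
    """Draw n standardized bivariate Normal pairs with correlation rho."""
    rng = np.random.default_rng(seed)
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    return z1, rho * z1 + np.sqrt(1.0 - rho**2) * z2
```

Calling `cv_fixed_point(*sample_pair(5000, 0.6))` and `mle_by_roots(*sample_pair(5000, 0.6))` should return the same value up to numerical tolerance, since both solve the same cubic. The fixed-point form needs no polynomial solver or root selection, which is one plausible reason an iterative scheme of this kind can be more stable in practice.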
