Instance Based Approximations to Profile Maximum Likelihood
This work addresses computational bottlenecks in the estimation of symmetric properties, with improvements for instances that have few distinct observed frequencies.
The paper tackles the problem of efficiently computing approximate profile maximum likelihood (PML) distributions for symmetric property estimation. Its algorithm matches the previous best known efficient algorithms and improves on them when the instance has few distinct observed frequencies. It also provides the first provably efficient implementation of PseudoPML and efficient PML-based estimators for distributions with small profile entropy.
In this paper, we provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. Our algorithm matches the previous best known efficient algorithms for computing approximate PML distributions and improves on them when the number of distinct observed frequencies in the given instance is small. We achieve this result by exploiting a new sparsity structure in approximate PML distributions and by providing a new matrix rounding algorithm, which is of independent interest. Leveraging this result, we obtain the first provably computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. Additionally, we obtain efficient PML-based estimators for distributions with small profile entropy, a natural instance-based complexity measure. Further, we provide a simpler and more practical PseudoPML implementation that matches the best-known theoretical guarantees of such an estimator, and we evaluate this method empirically.
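To make the instance parameter concrete, the following minimal sketch computes the profile of a sample and its number of distinct observed frequencies. It assumes the standard definition of a profile (for each frequency, the number of distinct symbols observed that many times), which the abstract does not restate. The function names `profile` and `num_distinct_frequencies` are hypothetical and chosen for illustration only; this is not the paper's algorithm and does not compute a PML distribution.

```python
from collections import Counter


def profile(sample):
    """Map each observed frequency to the number of distinct symbols
    seen exactly that many times (assumed standard profile definition)."""
    symbol_counts = Counter(sample)           # symbol -> frequency
    return dict(Counter(symbol_counts.values()))  # frequency -> multiplicity


def num_distinct_frequencies(sample):
    """Number of distinct nonzero frequencies in the sample."""
    return len(profile(sample))


# Illustrative sample: a=5, b=2, r=2, c=1, d=1.
sample = list("abracadabra")
phi = profile(sample)
k = num_distinct_frequencies(sample)

# The profile ignores symbol labels, so relabeling leaves it unchanged.
# This label invariance is what makes it suited to symmetric properties.
relabel = {"a": "z", "b": "y", "r": "x", "c": "w", "d": "v"}
phi_relabeled = profile([relabel[s] for s in sample])

print(phi, k, phi == phi_relabeled)
```

Here the sample has 11 observations over 5 symbols but only 3 distinct frequencies (5, 2, and 1), which is the kind of instance where the number of distinct observed frequencies is small relative to the sample size.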