A General Framework for Symmetric Property Estimation
This work addresses a fundamental problem in statistics and machine learning that is relevant to researchers and practitioners working on distribution estimation, though it may appear incremental because it builds on existing PML methods.
The paper studies the estimation of symmetric properties of distributions from i.i.d. samples. It separates an easy region from a difficult region and shows that using approximate profile maximum likelihood (PML) distributions in the difficult region is sample-complexity optimal for many properties, in a broader parameter regime than previous PML-based approaches and with more practical algorithms.
In this paper, we provide a general framework for estimating symmetric properties of distributions from i.i.d. samples. For a broad class of symmetric properties, we identify the easy region, where empirical estimation works, and the difficult region, where more complex estimators are required. We show that by approximately computing the profile maximum likelihood (PML) distribution \cite{ADOS16} in this difficult region, we obtain a symmetric property estimation framework that is sample-complexity optimal for many properties in a broader parameter regime than previous universal estimation approaches based on PML. The resulting algorithms, based on these pseudo PML distributions, are also more practical.
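To make the terminology concrete, the following is a minimal Python sketch of the objects the abstract refers to: the profile of a sample (how many distinct symbols appear exactly j times), an empirical (plug-in) estimate of a symmetric property (Shannon entropy is used here as an example), and a split of the observed symbols into two groups. The function names and the frequency-threshold rule in `split_by_frequency` are assumptions made for illustration only; the abstract does not specify how the easy and difficult regions are defined, and the approximate PML computation itself is not implemented here.

```python
from collections import Counter
import math


def profile(sample):
    """Return {j: number of distinct symbols observed exactly j times}."""
    symbol_counts = Counter(sample)
    return dict(Counter(symbol_counts.values()))


def empirical_entropy(sample):
    """Plug-in Shannon entropy (in nats) of the empirical distribution."""
    n = len(sample)
    return -sum((c / n) * math.log(c / n) for c in Counter(sample).values())


def split_by_frequency(sample, threshold):
    """Hypothetical region split (an assumption, not the paper's rule):
    symbols seen more than `threshold` times are treated as 'easy',
    all other observed symbols as 'difficult'."""
    symbol_counts = Counter(sample)
    easy = {s: c for s, c in symbol_counts.items() if c > threshold}
    hard = {s: c for s, c in symbol_counts.items() if c <= threshold}
    return easy, hard


if __name__ == "__main__":
    sample = list("aaaabbbcde")
    relabeled = list("xxxxyyyzuv")  # same counts, different symbol names

    # A symmetric property depends only on the multiset of probabilities,
    # so the profile and the plug-in entropy are unchanged by relabeling.
    print(profile(sample) == profile(relabeled))
    print(math.isclose(empirical_entropy(sample), empirical_entropy(relabeled)))

    easy, hard = split_by_frequency(sample, threshold=2)
    print("easy:", easy)
    print("difficult:", hard)
```

In this sketch, the frequent symbols would be handled by the plug-in estimate, while the contribution of the rare symbols is where a PML-based estimator would be substituted.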