Sharp Bounds for Generalized Uniformity Testing
This work addresses a theoretical problem in distribution testing for researchers, providing fundamental limits and efficient algorithms, but it is incremental as it builds on prior work in uniformity testing.
The paper tackles the problem of generalized uniformity testing for discrete probability distributions, establishing tight bounds on sample complexity with an optimal tester and matching lower bound, showing it scales as Θ(1/(ε^{4/3}||p||_3) + 1/(ε^2 ||p||_2)).
We study the problem of generalized uniformity testing \cite{BC17} of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbfΩ$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbfΩ$ versus $ε$-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is $Θ\left(1/(ε^{4/3}\|p\|_3) + 1/(ε^{2} \|p\|_2) \right)$.