DS | DM | LG | ML | Nov 17, 2019

Testing Properties of Multiple Distributions with Few Samples

arXiv:1911.07324v1 | 4 citations
AI Analysis

This work addresses a challenge in statistical hypothesis testing: each source supplies only limited data, as happens in distributed or high-dimensional settings.

The paper studies testing of uniformity, identity, and closeness across multiple distributions when only a few samples are available from each, and gives sample-optimal testers under an additional condition on the source distributions.

We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the following problems: (1) Uniformity Testing: testing whether all the $p_i$'s are uniform or $\varepsilon$-far from being uniform in $\ell_1$-distance; (2) Identity Testing: testing whether all the $p_i$'s are equal to an explicitly given distribution $q$ or $\varepsilon$-far from $q$ in $\ell_1$-distance; and (3) Closeness Testing: testing whether all the $p_i$'s are equal to a distribution $q$ to which we have sample access, or $\varepsilon$-far from $q$ in $\ell_1$-distance. By assuming an additional natural condition about the source distributions, we provide sample-optimal testers for all of these problems.
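To make the uniformity problem concrete, here is a minimal sketch of the classical collision statistic, computed only over pairs of samples drawn from the same source. It is not the paper's tester and makes no claim about sample optimality. The function names, the domain size `n`, and the threshold $(1 + \varepsilon^2/2)/n$ are illustrative assumptions. The threshold relies on the standard fact that a distribution $\varepsilon$-far from uniform in $\ell_1$-distance has collision probability at least $(1 + \varepsilon^2)/n$.

```python
from collections import Counter


def l1_distance(p, q):
    """l1 distance between two distributions given as equal-length probability lists."""
    return sum(abs(a - b) for a, b in zip(p, q))


def within_source_collision_rate(samples_per_source):
    """Fraction of same-source sample pairs that collide (take the same value).

    samples_per_source: list of lists, one (possibly very short) list per source.
    Pairs are formed only inside a source, never across sources.
    """
    collisions = 0
    pairs = 0
    for samples in samples_per_source:
        m = len(samples)
        pairs += m * (m - 1) // 2
        counts = Counter(samples)
        collisions += sum(c * (c - 1) // 2 for c in counts.values())
    if pairs == 0:
        raise ValueError("need at least one source with two or more samples")
    return collisions / pairs


def looks_uniform(samples_per_source, n, eps):
    """Hypothetical decision rule: accept if the collision rate is near 1/n.

    Assumption: threshold halfway between 1/n (uniform) and (1 + eps^2)/n (eps-far).
    """
    threshold = (1 + eps ** 2 / 2) / n
    return within_source_collision_rate(samples_per_source) <= threshold
```

With two samples per source, each source contributes exactly one pair, so the statistic averages one collision indicator per distribution. This sketch covers only the uniformity case; identity and closeness testing need different statistics.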
