Optimal high-dimensional and nonparametric distributed testing under communication constraints
This work addresses communication-efficient statistical testing for distributed data. It has foundational implications for distributed inference, although it is partly incremental in that it extends prior results on distributed estimation to testing.
The paper tackles distributed hypothesis testing under communication constraints. It derives minimax error bounds and algorithms attaining them in both high-dimensional and nonparametric settings, and shows that consistent nonparametric testing is possible with as little as 1 bit of communication, with the resulting test outperforming the best test that uses only the data on a single machine.
We derive minimax testing errors in a distributed framework where the data is split over multiple machines and their communication to a central machine is limited to $b$ bits. We investigate both the $d$-dimensional and the infinite-dimensional signal detection problems under Gaussian white noise. We also derive distributed testing algorithms reaching the theoretical lower bounds. Our results show that distributed testing is subject to fundamentally different phenomena that are not observed in distributed estimation. Among our findings, we show that testing protocols that have access to shared randomness can perform strictly better in some regimes than those that do not. We also observe that consistent nonparametric distributed testing is always possible, even with as little as $1$ bit of communication, and that the corresponding test outperforms the best local test using only the information available at a single local machine. Furthermore, we derive adaptive nonparametric distributed testing strategies and the corresponding theoretical lower bounds.
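To make the $1$-bit setting concrete, the following is a minimal illustrative sketch and not the protocol analysed in the paper. It assumes that each of $m$ machines holds a $d$-dimensional sample mean with unit noise variance, thresholds a local chi-square statistic, and sends a single bit. The central machine then compares the number of ones with a binomial cutoff obtained from a normal approximation. All function names (`chi2_sf_even`, `local_bit`, `central_test`, `simulate_bits`) are hypothetical, the threshold choice is arbitrary, and $d$ is restricted to even values so that the chi-square tail has a closed form.

```python
import math
import random


def chi2_sf_even(t, d):
    """Exact P(chi2_d > t) for even d, via the Erlang (Poisson-sum) formula."""
    if d % 2 != 0:
        raise ValueError("this sketch only handles even d")
    half = t / 2.0
    term, total = 1.0, 0.0
    for i in range(d // 2):
        total += term          # adds (t/2)^i / i!
        term *= half / (i + 1)
    return math.exp(-half) * total


def local_bit(x, local_n, threshold):
    """One machine: x is the local sample mean of local_n unit-variance observations."""
    stat = local_n * sum(v * v for v in x)  # chi2_d distributed under H0: f = 0
    return 1 if stat > threshold else 0


def central_test(bits, p0, z=1.645):
    """Reject H0 if the number of ones exceeds an approximate binomial cutoff.

    p0 is the null probability that a single machine sends 1; z = 1.645
    targets a level of roughly 0.05 through the normal approximation.
    """
    m = len(bits)
    cutoff = m * p0 + z * math.sqrt(m * p0 * (1.0 - p0))
    return sum(bits) > cutoff


def simulate_bits(signal, m, local_n, threshold, rng):
    """Draw the local sample means for m machines and return their 1-bit messages."""
    sd = 1.0 / math.sqrt(local_n)
    bits = []
    for _ in range(m):
        x = [mu + rng.gauss(0.0, sd) for mu in signal]
        bits.append(local_bit(x, local_n, threshold))
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    d, m, local_n = 4, 200, 50
    threshold = float(d)                 # arbitrary choice: the null mean of the statistic
    p0 = chi2_sf_even(threshold, d)      # null probability of sending a 1
    bits = simulate_bits([0.2] * d, m, local_n, threshold, rng)
    print("reject H0:", central_test(bits, p0))
```

The sketch aggregates many weak local decisions: a signal that only mildly shifts each machine's exceedance probability can still move the total count well beyond its null fluctuations, which gives some intuition for why a $1$-bit test can improve on a single local test. It uses no shared randomness and makes no claim of attaining the minimax rates derived in the paper.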