Inference under Information Constraints II: Communication Constraints and Shared Randomness
This work addresses communication-efficient distributed inference, giving sample-optimal protocols for distribution learning and identity testing.
The paper studies distributed statistical inference, specifically distribution learning and identity testing, under communication constraints, and examines the role of shared randomness. It proposes a private-coin simulate-and-infer strategy that is sample-optimal for distribution learning and, among private-coin protocols, for distribution testing. It also gives a public-coin protocol that outperforms simulate-and-infer for distribution testing and is sample-optimal.
A central server needs to perform statistical inference based on samples that are distributed over multiple users who can each send a message of limited length to the center. We study problems of distribution learning and identity testing in this distributed inference setting and examine the role of shared randomness as a resource. We propose a general-purpose simulate-and-infer strategy that uses only private-coin communication protocols and is sample-optimal for distribution learning. This general strategy turns out to be sample-optimal even for distribution testing among private-coin protocols. Interestingly, we propose a public-coin protocol that outperforms simulate-and-infer for distribution testing and is, in fact, sample-optimal. Underlying our public-coin protocol is a random hash that when applied to the samples minimally contracts the chi-squared distance of their distribution to the uniform distribution.
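The contraction of the chi-squared distance under a random hash can be illustrated numerically. The sketch below is not the paper's construction: it assumes the hash is a uniformly random balanced partition of a domain of size k into 2^ell equal-sized buckets (so the uniform distribution maps to the uniform distribution), and all function names are hypothetical. It pushes a distribution through such a hash and compares its chi-squared distance to uniform before and after.

```python
import random

def chi2_to_uniform(p):
    """Chi-squared distance between p and the uniform distribution on len(p) symbols."""
    n = len(p)
    return n * sum((x - 1.0 / n) ** 2 for x in p)

def balanced_random_hash(k, num_buckets, rng):
    """Assumed hash: a random partition of range(k) into equal-sized buckets."""
    if k % num_buckets != 0:
        raise ValueError("this sketch assumes num_buckets divides k")
    perm = list(range(k))
    rng.shuffle(perm)
    block = k // num_buckets
    h = [0] * k
    for pos, sym in enumerate(perm):
        h[sym] = pos // block  # symbols in the same block share a bucket
    return h

def push_forward(p, h, num_buckets):
    """Distribution of h(X) when X is drawn from p."""
    q = [0.0] * num_buckets
    for sym, mass in enumerate(p):
        q[h[sym]] += mass
    return q

def average_contraction(p, ell, trials=2000, seed=0):
    """Monte Carlo estimate of E[chi2(hashed p)] / chi2(p) for ell-bit hashes."""
    rng = random.Random(seed)
    k, num_buckets = len(p), 2 ** ell
    base = chi2_to_uniform(p)
    total = 0.0
    for _ in range(trials):
        h = balanced_random_hash(k, num_buckets, rng)
        total += chi2_to_uniform(push_forward(p, h, num_buckets))
    return total / trials / base

if __name__ == "__main__":
    k, eps = 16, 0.5
    # A perturbed-uniform distribution: masses alternate between (1+eps)/k and (1-eps)/k.
    p = [(1 + eps) / k if i % 2 == 0 else (1 - eps) / k for i in range(k)]
    for ell in (1, 2, 3, 4):
        print(ell, average_contraction(p, ell))
```

Under the balanced-partition assumption used here, a direct calculation gives an expected ratio of (2^ell - 1)/(k - 1), so the distance shrinks by a factor of roughly 2^ell/k on average. Whether this matches the exact hash family and constants in the paper is not something the abstract settles.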