LG · IT · ST · Jan 14, 2025

Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal

arXiv:2501.07879v1 · h-index: 2 · ICML
Originality: Highly original
AI Analysis

This work addresses a foundational communication-constrained estimation problem in distributed systems and resolves it across all sample regimes, closing a gap left by prior results that covered only special cases.

The paper studies nonparametric function estimation under communication constraints in distributed settings. It characterizes the minimax optimal rates across all regimes, from sparse to dense samples per terminal, and identifies the phase transitions in these rates. This resolves a problem left open by prior work, which covered only dense samples or a single sample per terminal. The results apply directly to special cases such as density estimation and several regression models.

We consider the communication-constrained problem of nonparametric function estimation, in which each distributed terminal holds multiple i.i.d. samples. Under certain regularity assumptions, we characterize the minimax optimal rates for all regimes and identify phase transitions of the optimal rates as the number of samples per terminal varies from sparse to dense. This fully solves the problem left open by previous works, whose scope is limited to regimes with either dense samples or a single sample per terminal. To achieve the optimal rates, we design a layered estimation protocol that exploits protocols for the parametric density estimation problem. We show the optimality of the protocol using information-theoretic methods and strong data processing inequalities, incorporating the classic balls and bins model. The optimal rates follow immediately for various special cases, such as density estimation and Gaussian, binary, Poisson, and heteroskedastic regression models.
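To make the setting concrete, here is a minimal toy sketch of communication-constrained density estimation. It is not the paper's layered protocol. It only illustrates how the number of samples per terminal enters a one-bit scheme. The function name `one_bit_histogram`, the round-robin bin assignment, and the one-bit budget are all assumptions made for illustration. Each of `m` terminals holds `n` i.i.d. samples on [0, 1) and is assigned one of `k` histogram bins. It sends a single bit saying whether any of its samples landed in that bin, an occupancy event in the balls and bins sense. If bin `j` has probability `p_j`, the bit equals 1 with probability `1 - (1 - p_j)^n`, which the server inverts.

```python
import numpy as np


def one_bit_histogram(samples: np.ndarray, k: int) -> np.ndarray:
    """Toy one-bit-per-terminal histogram estimator (illustrative assumption,
    not the protocol from the paper).

    samples: array of shape (m, n); row i holds terminal i's n samples in [0, 1).
    k: number of equal-width histogram bins.
    Returns an array of k estimated bin probabilities.
    """
    m, n = samples.shape
    if m < k:
        raise ValueError("need at least one terminal per bin (m >= k)")

    # Bin index of every sample; clip guards against a sample equal to 1.0.
    bin_idx = np.minimum((samples * k).astype(int), k - 1)

    # Round-robin assignment: terminal i is responsible for bin i mod k.
    assigned = np.arange(m) % k

    # The single transmitted bit: did any local sample hit the assigned bin?
    bits = (bin_idx == assigned[:, None]).any(axis=1)

    p_hat = np.empty(k)
    for j in range(k):
        # Fraction of bin-j terminals whose bit is 1 estimates 1 - (1 - p_j)^n.
        q = bits[assigned == j].mean()
        # Invert the occupancy probability to recover p_j.
        p_hat[j] = 1.0 - (1.0 - q) ** (1.0 / n)
    return p_hat
```

For example, `one_bit_histogram(np.random.default_rng(0).random((20000, 4)), k=4)` estimates the four bin probabilities of a uniform density from 20,000 bits in total. The estimates need not sum to one, so a real pipeline would normalize them.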

