DS · DC · LG · Jul 11, 2023

$\ell_p$-Regression in the Arbitrary Partition Model of Communication

arXiv:2307.05117v1 · 3 citations · h-index: 58
Originality: Highly original
AI Analysis

This work addresses communication efficiency in distributed machine learning for regression tasks, offering significant improvements over prior results, though it is incremental in advancing theoretical bounds.

The paper tackles the distributed ℓp-regression problem in the arbitrary partition model and provides improved communication complexity bounds. For p=2, it gives an optimal bound of Θ̃(sd² + sd/ε) bits. For p∈(1,2), it gives an upper bound of Õ(sd²/ε + sd/poly(ε)), whose leading term depends only linearly on 1/ε when d is sufficiently large. It also proves lower bounds of Ω(sd² + sd/ε²) for p∈(0,1] and Ω(sd² + sd/ε) for p∈(1,2]; the latter matches the upper bound for p=2 up to logarithmic factors.

We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model, for $p\in (0,2]$. In this problem, there is a coordinator and $s$ servers. The $i$-th server receives $A^i\in\{-M, -M+1, \ldots, M\}^{n\times d}$ and $b^i\in\{-M, -M+1, \ldots, M\}^n$, and the coordinator would like to find a $(1+\epsilon)$-approximate solution to $\min_{x\in\mathbb{R}^d} \|(\sum_i A^i)x - (\sum_i b^i)\|_p$. Here $M \leq \mathrm{poly}(nd)$ for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits. For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound. Notably, for $d$ sufficiently large, our leading order term only depends linearly on $1/\epsilon$ rather than quadratically. We also show communication lower bounds of $\Omega(sd^2 + sd/\epsilon^2)$ for $p\in (0,1]$ and $\Omega(sd^2 + sd/\epsilon)$ for $p\in (1,2]$. Our bounds considerably improve previous bounds due to (Woodruff et al., COLT, 2013) and (Vempala et al., SODA, 2020).
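
To make the arbitrary partition setup concrete for $p = 2$, the following is a minimal sketch of a generic sketch-and-solve baseline; it is not the protocol from the paper. It assumes a shared random seed, a dense random sign sketch `S`, and small illustrative sizes (`n`, `d`, `s`, `M`, `m` are all chosen here for demonstration). Each server holds an additive share of $[A \mid b]$ and sends only its sketched share; because the sketch is linear, the coordinator recovers the sketch of the summed data by adding the messages. The code makes no attempt to count bits or to reach the communication bounds stated above.

```python
import random

def matmul(S, A):
    """Dense product S @ A for matrices stored as lists of rows."""
    cols = list(zip(*A))
    return [[sum(s * a for s, a in zip(row, col)) for col in cols] for row in S]

def add(X, Y):
    """Entrywise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def solve_ls(Ab):
    """Minimise ||Ax - b||_2 for Ab = [A | b] via the normal equations."""
    d = len(Ab[0]) - 1
    # Augmented normal system [A^T A | A^T b].
    G = [[sum(r[i] * r[j] for r in Ab) for j in range(d + 1)] for i in range(d)]
    for c in range(d):
        # Gauss-Jordan elimination with partial pivoting.
        p = max(range(c, d), key=lambda r: abs(G[r][c]))
        G[c], G[p] = G[p], G[c]
        for r in range(d):
            if r != c:
                f = G[r][c] / G[c][c]
                G[r] = [u - f * v for u, v in zip(G[r], G[c])]
    return [G[i][d] / G[i][i] for i in range(d)]

def l2_cost(Ab, x):
    """||Ax - b||_2 for Ab = [A | b]; zip stops after the first len(x) columns."""
    return sum((sum(a * v for a, v in zip(r, x)) - r[-1]) ** 2 for r in Ab) ** 0.5

# Illustrative sizes (assumptions, not taken from the paper).
n, d, s, M, m = 200, 3, 4, 10, 40

# Server i holds an additive share [A^i | b^i] with entries in {-M, ..., M}.
data_rng = random.Random(0)
shares = [[[data_rng.randint(-M, M) for _ in range(d + 1)] for _ in range(n)]
          for _ in range(s)]

# Shared randomness: every party derives the same m x n sign sketch.
sketch_rng = random.Random(1)
S = [[sketch_rng.choice((-1.0, 1.0)) / m ** 0.5 for _ in range(n)]
     for _ in range(m)]

# Each server sends S @ [A^i | b^i]; the coordinator adds the messages.
messages = [matmul(S, share) for share in shares]
sketched_sum = messages[0]
for msg in messages[1:]:
    sketched_sum = add(sketched_sum, msg)

# Reference only: the summed data that no single party actually holds.
full = shares[0]
for share in shares[1:]:
    full = add(full, share)

# Linearity check: sum_i S [A^i | b^i] equals S [sum_i A^i | sum_i b^i].
max_diff = max(abs(u - v)
               for ru, rv in zip(sketched_sum, matmul(S, full))
               for u, v in zip(ru, rv))

x_sketch = solve_ls(sketched_sum)  # coordinator's approximate solution
x_opt = solve_ls(full)             # exact least-squares solution
ratio = l2_cost(full, x_sketch) / l2_cost(full, x_opt)
print(f"linearity error {max_diff:.2e}, cost ratio {ratio:.3f}")
```

The printed cost ratio plays the role of $1+\epsilon$: it is always at least 1, and in this baseline it tends to shrink as the sketch size `m` grows, which is what drives the dependence on $\epsilon$ in sketch-based approaches.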
