Local Steps Speed Up Local GD for Heterogeneous Distributed Logistic Regression
This work addresses the challenge of efficient distributed optimization for heterogeneous data in machine learning, representing an incremental improvement by providing a tighter convergence analysis for a specific method.
The paper tackles the problem of distributed logistic regression with heterogeneous data by analyzing Local Gradient Descent variants, showing convergence at a rate of O(1/KR) for K local steps and sufficient communication rounds R, which improves over prior guarantees of at least Ω(1/R) that did not demonstrate the benefit of local updates.
We analyze two variants of Local Gradient Descent applied to distributed logistic regression with heterogeneous, separable data and show convergence at the rate $O(1/KR)$ for $K$ local steps and sufficiently large $R$ communication rounds. In contrast, all existing convergence guarantees for Local GD applied to any problem are at least $Ω(1/R)$, meaning they fail to show the benefit of local updates. The key to our improved guarantee is showing progress on the logistic regression objective when using a large stepsize $η\gg 1/K$, whereas prior analysis depends on $η\leq 1/K$.