What Makes Local Updates Effective: The Role of Data Heterogeneity and Smoothness
This work provides theoretical insights for researchers and practitioners in distributed and federated learning, but it is incremental as it builds on existing analysis frameworks.
The thesis tackled the problem of understanding when local update algorithms like Local SGD outperform centralized methods in distributed and federated optimization, showing that bounded second-order heterogeneity is necessary and sufficient for this advantage and establishing tight convergence bounds in various settings.
This thesis contributes to the theoretical understanding of local update algorithms, especially Local SGD, in distributed and federated optimization under realistic models of data heterogeneity. A central focus is on the bounded second-order heterogeneity assumption, which is shown to be both necessary and sufficient for local updates to outperform centralized or mini-batch methods in convex and non-convex settings. The thesis establishes tight upper and lower bounds in several regimes for various local update algorithms and characterizes the min-max complexity of multiple problem classes. At its core is a fine-grained consensus-error-based analysis framework that yields sharper finite-time convergence bounds under third-order smoothness and relaxed heterogeneity assumptions. The thesis also extends to online federated learning, providing fundamental regret bounds under both first-order and bandit feedback. Together, these results clarify when and why local updates offer provable advantages, and the thesis serves as a self-contained guide for analyzing Local SGD in heterogeneous environments.