LG · OC · Dec 1, 2025

Beyond Scaffold: A Unified Spatio-Temporal Gradient Tracking Method

arXiv:2512.01732v1 · h-index: 3
Originality: Incremental advance
AI Analysis

This work targets communication efficiency and model drift in distributed and federated learning. It is an incremental improvement over existing methods such as Scaffold.

The paper tackles model drift in distributed and federated learning, which is caused by data heterogeneity across nodes and by local gradient noise. It proposes ST-GT, a unified spatio-temporal gradient tracking algorithm. ST-GT achieves a linear convergence rate for strongly convex problems and reduces the topology-dependent noise term from σ² to σ²/τ, where σ² is the gradient noise level and τ is the number of local updates per communication round, which improves communication efficiency.
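The σ²/τ scaling can be illustrated in isolation. The sketch below is not the paper's algorithm: it assumes i.i.d. Gaussian gradient noise and simply averages τ noisy samples of a fixed scalar gradient, the textbook setting in which the variance of the average shrinks by a factor of τ. The function name, the value 2.0, and the trial count are illustrative assumptions.

```python
import random
import statistics


def averaged_gradient_variance(tau, sigma=1.0, trials=20000, seed=0):
    """Empirical variance of the mean of `tau` noisy gradient samples.

    Assumption: each sample is true_gradient + N(0, sigma^2) noise, with
    samples independent of each other.
    """
    rng = random.Random(seed)
    true_gradient = 2.0  # arbitrary fixed value; only the noise matters here
    averages = []
    for _ in range(trials):
        samples = [true_gradient + rng.gauss(0.0, sigma) for _ in range(tau)]
        averages.append(sum(samples) / tau)
    return statistics.pvariance(averages)


if __name__ == "__main__":
    # The printed variance should be close to sigma^2 / tau for each tau.
    for tau in (1, 4, 16):
        print(tau, averaged_gradient_variance(tau))
```

With `sigma=1.0`, the estimates should fall near 1, 1/4, and 1/16, up to sampling error.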

In distributed and federated learning algorithms, communication overhead is often reduced by performing multiple local updates between communication rounds. However, due to data heterogeneity across nodes and the local gradient noise within each node, this strategy can cause local models to drift away from the global optimum. To address this issue, we revisit the well-known federated learning method Scaffold (Karimireddy et al., 2020) from a gradient tracking perspective, and propose a unified spatio-temporal gradient tracking algorithm, termed ST-GT, for distributed stochastic optimization over time-varying graphs. ST-GT tracks the global gradient across neighboring nodes to mitigate data heterogeneity, while maintaining a running average of local gradients to substantially suppress noise, at the cost of slightly more storage. Without assuming bounded data heterogeneity, we prove that ST-GT attains a linear convergence rate for strongly convex problems and a sublinear rate for nonconvex problems. Notably, ST-GT achieves the first linear speed-up in communication complexity with respect to the number of local updates per round $\tau$ in the strongly convex setting. Compared to traditional gradient tracking methods, ST-GT reduces the topology-dependent noise term from $\sigma^2$ to $\sigma^2/\tau$, where $\sigma^2$ denotes the noise level, thereby improving communication efficiency.
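For readers unfamiliar with the gradient tracking baseline that the abstract compares against, the following is a minimal sketch of classical gradient tracking, not of ST-GT itself. It assumes noise-free gradients, a fixed ring graph rather than a time-varying one, one update per communication round, and hypothetical scalar local objectives f_i(x) = (x - b_i)²/2 whose minimizers b_i differ across nodes (a toy stand-in for data heterogeneity). All function names and parameter values are illustrative.

```python
def ring_mixing_matrix(n):
    """Doubly stochastic weights: each node averages itself and its two
    ring neighbours with weight 1/3 each (assumes n >= 3)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def mix(W, v):
    """One communication round: each node takes a weighted neighbour average."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def gradient_tracking(targets, step=0.1, rounds=300):
    """Classical gradient tracking on f_i(x) = 0.5 * (x - targets[i])**2.

    x[i] is node i's model; y[i] is node i's estimate of the average
    gradient over all nodes.
    """
    n = len(targets)
    W = ring_mixing_matrix(n)

    def grad(x):
        return [x_i - b_i for x_i, b_i in zip(x, targets)]

    x = [0.0] * n
    g = grad(x)
    y = list(g)  # tracker starts at the local gradients
    for _ in range(rounds):
        # Descend along the tracked direction, then average with neighbours.
        x = mix(W, [x_i - step * y_i for x_i, y_i in zip(x, y)])
        g_new = grad(x)
        # Average the trackers, then correct by the local gradient change.
        y = [my + gn - go for my, gn, go in zip(mix(W, y), g_new, g)]
        g = g_new
    return x


if __name__ == "__main__":
    # Heterogeneous local minimizers; the global minimizer is their mean.
    print(gradient_tracking([1.0, -2.0, 4.0, 9.0]))
```

In this toy setting every node's iterate approaches the mean of the targets, even though no node's local minimizer equals it. This is the heterogeneity correction that tracking provides.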

