LG ML Jan 11, 2022

Partial Model Averaging in Federated Learning: Performance Guarantees and Benefits

Sunwoo Lee, Anit Kumar Sahu, Chaoyang He, Salman Avestimehr

arXiv:2201.03789v19.624 citations

Originality: Incremental advance

AI Analysis

This paper addresses a performance bottleneck in Federated Learning for distributed machine learning systems, offering an incremental improvement over existing methods.

The paper tackles the model discrepancy issue in Federated Learning caused by periodic full averaging in FedAvg, which slows global loss convergence. By proposing a partial model averaging framework, it achieves up to 2.2% higher validation accuracy with 128 workers compared to full averaging.

Local Stochastic Gradient Descent (SGD) with periodic model averaging (FedAvg) is a foundational algorithm in Federated Learning. The algorithm runs SGD independently on multiple workers and periodically averages the model across all the workers. When local SGD runs with many workers, however, the periodic averaging causes a significant model discrepancy across the workers, which makes the global loss converge slowly. While recent advanced optimization methods tackle this issue with a focus on non-IID settings, the model discrepancy remains because of the underlying periodic model averaging. We propose a partial model averaging framework that mitigates the model discrepancy issue in Federated Learning. Partial averaging encourages the local models to stay close to each other in parameter space, which makes it possible to minimize the global loss more effectively. Given a fixed number of iterations and a large number of workers (128), partial averaging achieves up to 2.2% higher validation accuracy than periodic full averaging.
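The contrast between full and partial averaging can be illustrated with a toy sketch. The abstract does not spell out how the model is partitioned or when each part is synchronized, so everything below is an assumption made for illustration: models are flat lists of floats, `block_for_step` is a hypothetical round-robin schedule, and `discrepancy` is one simple way to measure how far the workers have drifted apart. It is not the authors' implementation.

```python
from typing import Iterable, List

Model = List[float]  # toy stand-in for a flattened parameter vector


def partial_average(models: List[Model], indices: Iterable[int]) -> None:
    """Average only the coordinates in `indices` across workers, in place.

    Coordinates outside `indices` keep their local values.
    """
    n = len(models)
    for i in indices:
        mean = sum(m[i] for m in models) / n
        for m in models:
            m[i] = mean


def full_average(models: List[Model]) -> None:
    """FedAvg-style full averaging: synchronize every coordinate at once."""
    partial_average(models, range(len(models[0])))


def discrepancy(models: List[Model]) -> float:
    """Mean squared distance of each worker model from the across-worker mean."""
    n, dim = len(models), len(models[0])
    means = [sum(m[i] for m in models) / n for i in range(dim)]
    return sum((m[i] - means[i]) ** 2 for m in models for i in range(dim)) / n


def block_for_step(step: int, dim: int, num_blocks: int) -> List[int]:
    """Hypothetical schedule (an assumption): at each step, pick the
    coordinates whose index falls in block `step % num_blocks`."""
    block = step % num_blocks
    return [i for i in range(dim) if i % num_blocks == block]


if __name__ == "__main__":
    # Two workers whose local models have drifted apart.
    workers = [[0.0, 2.0], [2.0, 4.0]]
    print("before:", discrepancy(workers))
    # Synchronize one block per step instead of the whole model at once.
    for step in range(2):
        partial_average(workers, block_for_step(step, dim=2, num_blocks=2))
        print("after step", step, ":", discrepancy(workers))
```

In this sketch, each step synchronizes only part of the model, so some synchronization happens at every step rather than all of it at the end of a long local-training interval. That is one way to read the claim that partial averaging keeps local models close in parameter space.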
