Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning
The paper targets improved generalization and robustness in federated learning on edge devices, but the contribution is incremental because it builds on existing sharpness-aware minimization (SAM) approaches.
The paper tackles data heterogeneity in federated learning, which drives models toward sharp minima and harms generalization. It introduces FedGloSS, a method that applies SAM on the server to optimize global sharpness without extra client communication, and it reports flatter minima and better performance across benchmarks.
Federated learning (FL) enables collaborative model training while preserving privacy. Data heterogeneity across edge devices (clients) can cause models to converge to sharp minima, negatively impacting generalization and robustness. Recent approaches use client-side sharpness-aware minimization (SAM) to encourage flatter minima, but the discrepancy between local and global loss landscapes often undermines their effectiveness, because optimizing for local sharpness does not ensure global flatness. This work introduces FedGloSS (Federated Global Server-side Sharpness), a novel FL approach that prioritizes the optimization of global sharpness on the server using SAM. To reduce communication overhead, FedGloSS approximates sharpness using the previous global gradient, eliminating the need for additional client communication. Our extensive evaluations demonstrate that FedGloSS consistently reaches flatter minima and achieves better performance than state-of-the-art FL methods across various federated vision benchmarks.
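The server-side idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes that the SAM ascent direction is taken from the previous round's aggregated pseudo-gradient, so each round still costs one model broadcast and one model upload per client. The function names, the toy quadratic client objective, and the hyperparameters (`rho`, `server_lr`, `local_lr`, `local_steps`) are all hypothetical, and any further details of the published method are not reproduced here.

```python
import numpy as np

def sam_perturbation(prev_pseudo_grad, rho):
    """SAM ascent step rho * g / ||g||, with g taken from the PREVIOUS round.

    Assumption: reusing the last global pseudo-gradient stands in for the
    current one, so no extra client round-trip is needed.
    """
    norm = np.linalg.norm(prev_pseudo_grad)
    if norm == 0.0:
        # First round (or zero gradient): no perturbation is possible.
        return np.zeros_like(prev_pseudo_grad)
    return rho * prev_pseudo_grad / norm

def client_update(w_start, target, local_lr=0.1, local_steps=5):
    """Toy client: gradient descent on 0.5 * ||w - target||^2.

    Different targets per client mimic heterogeneous local data.
    """
    w = w_start.copy()
    for _ in range(local_steps):
        w -= local_lr * (w - target)
    return w

def server_round(w_global, prev_pseudo_grad, client_targets, rho=0.05, server_lr=1.0):
    """One round: perturb on the server, train on clients, step from w_global."""
    # Clients start from the perturbed model instead of w_global.
    w_perturbed = w_global + sam_perturbation(prev_pseudo_grad, rho)
    client_models = [client_update(w_perturbed, t) for t in client_targets]
    # Pseudo-gradient measured at the perturbed point.
    pseudo_grad = w_perturbed - np.mean(client_models, axis=0)
    # SAM-style update: apply that gradient to the unperturbed global model.
    w_next = w_global - server_lr * pseudo_grad
    return w_next, pseudo_grad

# Usage: two clients with conflicting optima.
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w, g = np.zeros(2), np.zeros(2)
for _ in range(50):
    w, g = server_round(w, g, targets)
```

In this sketch the pseudo-gradient returned by one round is fed into the next as `prev_pseudo_grad`, which is the only state the server has to keep beyond the global model itself.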