FedRepOpt: Gradient Re-parametrized Optimizers in Federated Learning
This work addresses performance and convergence issues that arise when deploying advanced models on edge devices in federated learning, but the contribution is incremental because it builds on existing gradient re-parameterization methods.
The paper tackles the problem of training large models on edge devices in federated learning, where computational limitations lead to suboptimal gradient updates. It proposes FedRepOpt, which improves performance by 16.7% and 11.4% over RepGhost-style and RepVGG-style networks, respectively, and converges 11.7% and 57.4% faster than the corresponding complex-structure models.
Federated Learning (FL) has emerged as a privacy-preserving method for training machine learning models in a distributed manner on edge devices. However, on-device models face inherent limits on computational power and memory, which can constrain gradient updates. As model size increases, the frequency of gradient updates on edge devices decreases, ultimately leading to suboptimal training outcomes in any given FL round. This limits the feasibility of deploying advanced, large-scale models on edge devices and hinders potential performance gains. To address this issue, we propose FedRepOpt, a gradient re-parameterized optimizer for FL. Gradient re-parameterization allows a simple local model to be trained to performance similar to that of a complex model by modifying the optimizer's gradients according to a set of model-specific hyperparameters obtained from the complex model. In this work, we focus on VGG-style and Ghost-style models in the FL environment. Extensive experiments demonstrate that models using FedRepOpt obtain a significant performance boost of 16.7% and 11.4% compared to the RepGhost-style and RepVGG-style networks, respectively, while also converging 11.7% and 57.4% faster than their complex-structure counterparts.
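To make the idea of modifying the optimizer's gradients concrete, the following is a minimal sketch and not the authors' implementation. It assumes the model-specific hyperparameters take the form of fixed element-wise gradient multipliers (`grad_mults`), that clients run plain SGD, and that the server aggregates with a sample-size-weighted average; the names `rep_sgd_step` and `fed_avg` are hypothetical.

```python
from typing import Dict, List

# A model is represented as named, flat lists of weights (illustrative only).
Params = Dict[str, List[float]]


def rep_sgd_step(params: Params, grads: Params, grad_mults: Params, lr: float) -> Params:
    """One local SGD step in which each gradient entry is scaled by a fixed
    multiplier before the update (assumed form of gradient re-parameterization)."""
    return {
        name: [
            w - lr * m * g
            for w, g, m in zip(params[name], grads[name], grad_mults[name])
        ]
        for name in params
    }


def fed_avg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Server-side aggregation: average client weights, weighted by the
    number of local samples (an assumed aggregation rule)."""
    total = sum(client_sizes)
    first = client_params[0]
    return {
        name: [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


# Usage: two clients start from the same global weights, take one scaled
# step on their own gradients, and the server averages the results.
global_params = {"w": [1.0, 2.0]}
mults = {"w": [2.0, 1.0]}  # hypothetical multipliers derived offline
client_grads = [{"w": [0.5, 0.5]}, {"w": [-0.5, 1.0]}]
local_models = [rep_sgd_step(global_params, g, mults, lr=0.1) for g in client_grads]
new_global = fed_avg(local_models, client_sizes=[10, 30])
```

In this sketch the multipliers are computed once and stay constant during training, so each client only stores and updates the simple model's weights; the extra cost per step is one element-wise multiplication.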