GPT-FL: Generative Pre-trained Model-Assisted Federated Learning
GPT-FL addresses efficiency and performance issues in federated learning for applications with data privacy constraints; it represents an incremental improvement.
The paper proposes GPT-FL, which uses generative pre-trained models to create synthetic data for server-side training and then fine-tunes the resulting model on private client data through federated learning. The authors report that it consistently outperforms state-of-the-art FL methods in test accuracy, communication efficiency, and client sampling efficiency.
In this work, we propose GPT-FL, a generative pre-trained model-assisted federated learning (FL) framework. At its core, GPT-FL leverages generative pre-trained models to generate diversified synthetic data. These generated data are used to train a downstream model on the server, which is then fine-tuned with private client data under the standard FL framework. We show that GPT-FL consistently outperforms state-of-the-art FL methods in terms of model test accuracy, communication efficiency, and client sampling efficiency. Through comprehensive ablation analysis across various data modalities, we discover that the downstream model trained on synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL. Moreover, regardless of whether the target data falls within or outside the domain of the pre-trained generative model, GPT-FL consistently achieves significant performance gains, surpassing the results obtained by models trained solely with FL or solely with synthetic data. The code is available at https://github.com/AvestimehrResearchGroup/GPT-FL.
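The three-stage pipeline described in the abstract (generate synthetic data, train a downstream model on the server, fine-tune it with federated learning on private client data) can be illustrated with a toy sketch. This is not the authors' implementation: the linear model, the `make_data` stand-in for a generative pre-trained model, the FedAvg-style weighted averaging, and every name and hyperparameter below are assumptions chosen only to make the control flow concrete. The numbers it produces say nothing about the paper's results.

```python
import random

def predict(w, x):
    """Linear model output w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd_steps(w, data, lr=0.1, epochs=1):
    """Plain per-sample SGD on squared error; returns a new weight list."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def fedavg_round(w_global, client_datasets, lr=0.1, local_epochs=1):
    """One FedAvg-style round (assumed aggregation rule): clients train
    locally from the global weights, the server averages by dataset size."""
    total = sum(len(d) for d in client_datasets)
    new_w = [0.0] * len(w_global)
    for data in client_datasets:
        w_local = sgd_steps(w_global, data, lr, local_epochs)
        for i, wi in enumerate(w_local):
            new_w[i] += wi * len(data) / total
    return new_w

def mse(w, data):
    """Mean squared error of the linear model on a dataset."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def make_data(rng, weights, n, noise):
    """Hypothetical data source: samples y = weights . x + Gaussian noise."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in weights]
        data.append((x, predict(weights, x) + rng.gauss(0.0, noise)))
    return data

def gpt_fl_sketch(seed=0, rounds=3):
    rng = random.Random(seed)
    true_w = [2.0, -1.0, 0.5]
    # Stage 1 (stand-in): "synthetic" data from a shifted, noisier source,
    # playing the role of samples from a generative pre-trained model.
    synthetic = make_data(rng, [1.8, -0.8, 0.4], 200, noise=0.3)
    # Private client data and a held-out test set from the true distribution.
    clients = [make_data(rng, true_w, 30, noise=0.05) for _ in range(5)]
    test = make_data(rng, true_w, 200, noise=0.05)

    # Stage 2: train the downstream model on the server with synthetic data.
    w_synth = sgd_steps([0.0] * 3, synthetic, lr=0.05, epochs=5)

    # Stage 3: fine-tune that model with federated rounds on client data.
    w_fl = w_synth
    for _ in range(rounds):
        w_fl = fedavg_round(w_fl, clients)

    # Reference run: the same federated rounds from a zero initialization.
    w_scratch = [0.0] * 3
    for _ in range(rounds):
        w_scratch = fedavg_round(w_scratch, clients)

    return {
        "synthetic_only": mse(w_synth, test),
        "gpt_fl": mse(w_fl, test),
        "fl_from_scratch": mse(w_scratch, test),
    }

if __name__ == "__main__":
    for name, loss in gpt_fl_sketch().items():
        print(f"{name}: test MSE = {loss:.4f}")
```

Calling `gpt_fl_sketch()` returns the test error of the synthetic-only model, the model fine-tuned with federated rounds, and a zero-initialized federated baseline, so the three settings the abstract compares can be inspected side by side on this toy problem.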