GradualDiff-Fed: A Federated Learning Specialized Framework for Large Language Model
GradualDiff-Fed addresses the high communication cost of FL for LLMs, enabling more efficient and scalable fine-tuning in privacy-preserving settings. The contribution is incremental, as it builds on existing FL methods.
The paper tackles the challenge of fine-tuning large language models (LLMs) in federated learning (FL) by introducing GradualDiff-Fed, a framework that reduces communication costs by transmitting only weight differences, achieving performance comparable to centralized training while drastically cutting overhead.
The rapid proliferation of large language models (LLMs) has created an unprecedented demand for fine-tuning models for specialized domains, such as medical science. While federated learning (FL) offers a decentralized and privacy-preserving approach to collaboratively fine-tune LLMs without sharing raw data, it presents significant challenges, particularly in maintaining performance and efficiently managing large model sizes. In this paper, we introduce GradualDiff-Fed, an FL framework designed explicitly for LLMs and the challenge posed by their large parameter counts. GradualDiff-Fed reduces communication costs by transmitting only the difference of model weights, rather than the entire model, during training rounds. This approach significantly improves scalability and communication efficiency, making it more feasible to fine-tune LLMs across distributed clients without compromising performance. Our evaluation demonstrates that GradualDiff-Fed achieves performance on par with centralized training while drastically reducing communication overhead. These results highlight the potential of GradualDiff-Fed as an efficient solution for fine-tuning large models from distributed data in privacy-preserving settings.
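The idea of sending weight differences instead of full models can be illustrated with a minimal sketch. The abstract does not specify how GradualDiff-Fed encodes or aggregates the differences, so everything below is an assumption made for illustration: the function names `weight_delta` and `apply_mean_delta` are hypothetical, aggregation is a plain FedAvg-style mean, and a client omits any parameter tensor that did not change at all. Weights are plain Python lists rather than real model tensors.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def weight_delta(local: Weights, global_: Weights) -> Weights:
    """Client side: compute local minus global for each parameter tensor."""
    delta: Weights = {}
    for name, global_values in global_.items():
        diff = [lv - gv for lv, gv in zip(local[name], global_values)]
        # Assumption: a tensor with no change is not transmitted at all.
        if any(d != 0.0 for d in diff):
            delta[name] = diff
    return delta


def apply_mean_delta(global_: Weights, deltas: List[Weights]) -> Weights:
    """Server side: average the client deltas and add them to the global model."""
    num_clients = len(deltas)
    updated: Weights = {}
    for name, global_values in global_.items():
        total = [0.0] * len(global_values)
        for client_delta in deltas:
            # A tensor missing from a client's delta counts as a zero change.
            for i, value in enumerate(client_delta.get(name, [])):
                total[i] += value
        updated[name] = [gv + t / num_clients for gv, t in zip(global_values, total)]
    return updated


global_w = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
client_a = {"layer.weight": [1.5, 2.0], "layer.bias": [0.5]}
client_b = {"layer.weight": [0.5, 3.0], "layer.bias": [0.5]}

deltas = [weight_delta(client_a, global_w), weight_delta(client_b, global_w)]
new_global = apply_mean_delta(global_w, deltas)
```

In this toy round neither client changed `layer.bias`, so it is left out of both payloads, and the server still reconstructs a full updated model from the deltas alone. A dense difference is as large as the tensor it describes, so any real saving depends on the differences being sparse, compressible, or limited to a subset of parameters; the abstract does not say which applies.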