LG | Nov 28, 2024

Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures

Yicheng Zhang, Zhen Qin, Zhaomin Wu, Jian Hou, Shuiguang Deng

arXiv:2411.19128v411.59 citations | h-index: 10 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses inefficient federated fine-tuning of LLMs across diverse domains such as healthcare and finance. It offers a personalized solution that is an incremental advance in optimizing model architectures.

The paper tackles the challenge of fine-tuning large language models (LLMs) in federated learning settings with heterogeneous client data by proposing FedAMoLE, a framework that uses data-driven heterogeneous model architectures, resulting in an average 5.97% improvement in client-side performance over existing methods.

Large language models (LLMs) are increasingly powering web-based applications, whose effectiveness relies on fine-tuning with large-scale instruction data. However, such data often contains valuable or sensitive information that limits its public sharing among business organizations. Federated learning (FL) enables collaborative fine-tuning of LLMs without accessing raw data. Existing approaches to federated LLM fine-tuning usually adopt a uniform model architecture, making it challenging to fit highly heterogeneous client-side data in varying domains and tasks, e.g., hospitals and financial institutions conducting federated fine-tuning may require different LLM architectures due to the distinct nature of their domains and tasks. To address this, we propose FedAMoLE, a lightweight personalized FL framework that enables data-driven heterogeneous model architectures. It features a heterogeneous mixture of low-rank adaptation (LoRA) experts module to aggregate architecturally heterogeneous models and a reverse selection-based expert assignment strategy to tailor model architectures for each client based on data distributions. Experiments across seven scenarios demonstrate that FedAMoLE improves client-side performance by an average of 5.97% over existing approaches while maintaining practical memory, communication, and computation overhead.
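The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names (`LoRAExpert`, `MoLELayer`, `reverse_assign`), the softmax router, and the relevance scores are all hypothetical. It also assumes one possible reading of "reverse selection", namely that each expert picks the clients it fits best rather than each client picking experts. The sketch shows only how clients can end up with different numbers of LoRA experts on top of a shared frozen weight.

```python
import math
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class LoRAExpert:
    """One low-rank adapter: the weight update is B @ A with a small rank."""

    def __init__(self, d_in, d_out, rank, rng):
        # Common LoRA initialisation: A small random, B zero, so the
        # adapter contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        return matvec(self.B, matvec(self.A, x))


class MoLELayer:
    """Frozen base weight plus a gated mixture of LoRA experts (hypothetical)."""

    def __init__(self, W, experts, router):
        self.W = W              # frozen base weight, d_out x d_in
        self.experts = experts  # may differ in number from client to client
        self.router = router    # one row of d_in gate weights per expert

    def forward(self, x):
        out = matvec(self.W, x)
        if not self.experts:
            return out
        gates = softmax(matvec(self.router, x))
        for gate, expert in zip(gates, self.experts):
            out = [o + gate * d for o, d in zip(out, expert.delta(x))]
        return out


def reverse_assign(scores, clients_per_expert):
    """Assumed reading of reverse selection: each expert picks its top clients.

    scores[e][c] is a hypothetical relevance of expert e to client c.
    Returns a dict mapping client index to the list of experts that chose it.
    """
    n_clients = len(scores[0])
    assignment = {c: [] for c in range(n_clients)}
    for e, row in enumerate(scores):
        ranked = sorted(range(n_clients), key=lambda c: row[c], reverse=True)
        for c in ranked[:clients_per_expert]:
            assignment[c].append(e)
    return assignment
```

Because experts do the choosing in this sketch, clients can receive different numbers of experts, which is one way the per-client architectures could come to differ. How FedAMoLE actually computes relevance and aggregates experts is described in the paper, not here.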
