LG | Feb 3, 2024

Rethinking the Starting Point: Collaborative Pre-Training for Federated Downstream Tasks

Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton

arXiv:2402.02225v44.61 citations | h-index: 22 | AAAI

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in federated learning by providing a more robust initialization method. The advance is incremental but offers practical gains for FL applications.

The paper tackles the problem that pre-trained models generalize poorly to arbitrary downstream federated learning tasks, particularly those with unseen labels, and yield high accuracy variance across clients. It proposes CoPreFL, a collaborative pre-training approach based on MAML that improves average accuracy and reduces variance.

A few recent studies have demonstrated that leveraging centrally pre-trained models can offer advantageous initializations for federated learning (FL). However, existing pre-training methods do not generalize well when faced with an arbitrary set of downstream FL tasks. Specifically, they often (i) achieve limited average accuracy, particularly when there are unseen downstream labels, and (ii) result in significant accuracy variance, failing to provide a balanced performance across clients. To address these challenges, we propose CoPreFL, a collaborative/distributed pre-training approach which provides a robust initialization for downstream FL tasks. The key idea of CoPreFL is a model-agnostic meta-learning (MAML) procedure that tailors the global model to closely mimic heterogeneous and unseen FL scenarios, resulting in a pre-trained model that is rapidly adaptable to arbitrary FL tasks. Our MAML procedure incorporates performance variance into the meta-objective function, balancing performance across clients rather than solely optimizing for accuracy. Through extensive experiments, we demonstrate that CoPreFL obtains significant improvements in both average accuracy and variance across arbitrary downstream FL tasks with unseen/seen labels, compared with various pre-training baselines. We also show how CoPreFL is compatible with different well-known FL algorithms applied by the downstream tasks, enhancing performance in each case.
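The variance-aware meta-objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact objective, so the form `mean + gamma * variance`, the first-order MAML approximation, the scalar linear model, and every name here (`meta_step`, `gamma`, `inner_lr`, `outer_lr`) are assumptions chosen for illustration.

```python
from statistics import mean, pvariance

def mse(w, data):
    """Mean squared error of the scalar model y = w * x."""
    return mean((w * x - y) ** 2 for x, y in data)

def mse_grad(w, data):
    """Gradient of the MSE with respect to w."""
    return mean(2 * (w * x - y) * x for x, y in data)

def meta_objective(losses, gamma):
    """Assumed form: average client loss plus gamma times its variance."""
    return mean(losses) + gamma * pvariance(losses)

def meta_step(w, clients, inner_lr=0.05, outer_lr=0.05, gamma=0.5):
    """One first-order MAML update over (support, query) client splits."""
    adapted, losses = [], []
    for support, query in clients:
        # Inner step: each client adapts the shared weight on its support set.
        w_i = w - inner_lr * mse_grad(w, support)
        adapted.append(w_i)
        losses.append(mse(w_i, query))

    avg, n = mean(losses), len(losses)
    grad = 0.0
    for (_, query), w_i, loss in zip(clients, adapted, losses):
        # d(objective)/d(loss_i) = (1 + 2 * gamma * (loss_i - avg)) / n,
        # so clients with above-average loss get a larger weight.
        weight = (1 + 2 * gamma * (loss - avg)) / n
        # First-order approximation: ignore d(w_i)/d(w).
        grad += weight * mse_grad(w_i, query)

    return w - outer_lr * grad, meta_objective(losses, gamma)
```

Setting `gamma=0` recovers a plain average-loss meta-update, while a larger `gamma` shifts the update toward clients whose post-adaptation loss is above average, which is one way to trade accuracy for balance across clients.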
