LG May 25, 2023

pFedSim: Similarity-Aware Model Aggregation Towards Personalized Federated Learning

Jiahao Tan, Yipeng Zhou, Gang Liu, Jessie Hui Wang, Shui Yu

arXiv:2305.15706v111.531 citations

Originality: Incremental advance

AI Analysis

This work addresses data privacy and non-IID data issues in federated learning for distributed clients, offering an incremental improvement by integrating existing methods.

The paper tackles data heterogeneity in federated learning by proposing pFedSim, a personalized federated learning algorithm that combines similarity-based aggregation with model decoupling. The authors report significantly higher model accuracy than the baselines in experiments on real datasets, with low overhead and low privacy risk.

The federated learning (FL) paradigm emerged to preserve data privacy during model training by exposing only clients' model parameters rather than their original data. One of the biggest challenges in FL lies in the non-IID (non-independent and identically distributed) data, a.k.a. data heterogeneity, distributed across clients. To address this challenge, various personalized FL (pFL) methods have been proposed, such as similarity-based aggregation and model decoupling. The former aggregates models from clients with similar data distributions. The latter decouples a neural network (NN) model into a feature extractor and a classifier, with personalization captured by the classifiers, which are obtained by local training. To advance pFL, we propose a novel pFedSim (pFL based on model similarity) algorithm in this work by combining these two kinds of methods. More specifically, we decouple an NN model into a personalized feature extractor, obtained by aggregating models from similar clients, and a classifier, which is obtained by local training and used to estimate client similarity. Compared with the state-of-the-art baselines, the advantages of pFedSim include: 1) significantly improved model accuracy; 2) low communication and computation overhead; 3) a low risk of privacy leakage; 4) no requirement for any external public information. To demonstrate the superiority of pFedSim, extensive experiments are conducted on real datasets. The results validate the performance of our algorithm, which significantly outperforms baselines under various heterogeneous data settings.
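
The combination described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state which similarity metric or weighting rule pFedSim uses, so the cosine similarity, the softmax weighting, the `temperature` parameter, and all function names below are assumptions chosen for illustration. Models are represented as flat lists of floats.

```python
import math

def cosine(u, v):
    """Cosine similarity of two flat parameter vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def aggregation_weights(classifiers, temperature=0.1):
    """Row i holds the weights client i assigns to every client's extractor.

    Similarity is estimated from the locally trained classifiers; the
    softmax with a temperature is a hypothetical weighting rule.
    """
    weights = []
    for ci in classifiers:
        sims = [cosine(ci, cj) for cj in classifiers]
        top = max(sims)  # subtract the max for numerical stability
        exps = [math.exp((s - top) / temperature) for s in sims]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def personalize_extractors(extractors, classifiers, temperature=0.1):
    """Build one personalized feature extractor per client.

    Each client's extractor is a similarity-weighted average of all
    clients' extractors; classifiers are never averaged and stay local.
    """
    weights = aggregation_weights(classifiers, temperature)
    n, dim = len(extractors), len(extractors[0])
    return [
        [sum(w[j] * extractors[j][k] for j in range(n)) for k in range(dim)]
        for w in weights
    ]

# Toy example: clients 0 and 1 have similar classifiers, client 2 differs.
classifiers = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
extractors = [[1.0, 1.0], [1.2, 0.8], [-1.0, 3.0]]
personalized = personalize_extractors(extractors, classifiers)
```

In this toy setup, client 0's personalized extractor is dominated by the extractors of clients 0 and 1, while client 2 keeps an extractor close to its own, which is the qualitative behavior that similarity-based aggregation aims for.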
