LG, AI · Jun 27, 2025

CLoVE: Personalized Federated Learning through Clustering of Loss Vector Embeddings

Randeep Bhatia, Nikos Papadis, Murali Kodialam, TV Lakshman, Sayak Chakrabarty

arXiv:2506.22427v17.11 citations · h-index: 42

Originality: Highly original

AI Analysis

This paper addresses personalized federated learning for clients with non-IID data distributions, offering a simple and robust method that builds incrementally on existing CFL approaches.

The paper tackles the challenge of identifying client clusters in Clustered Federated Learning (CFL) by proposing CLoVE, which uses loss vector embeddings to separate clients and optimize cluster-specific models, achieving highly accurate cluster recovery in a few rounds and state-of-the-art model accuracy in experiments.

We propose CLoVE (Clustering of Loss Vector Embeddings), a novel algorithm for Clustered Federated Learning (CFL). In CFL, clients are naturally grouped into clusters based on their data distribution. However, identifying these clusters is challenging, as client assignments are unknown. CLoVE utilizes client embeddings derived from model losses on client data, and leverages the insight that clients in the same cluster share similar loss values, while those in different clusters exhibit distinct loss patterns. Based on these embeddings, CLoVE is able to iteratively identify and separate clients from different clusters and optimize cluster-specific models through federated aggregation. Key advantages of CLoVE over existing CFL algorithms are (1) its simplicity, (2) its applicability to both supervised and unsupervised settings, and (3) the fact that it eliminates the need for near-optimal model initialization, which makes it more robust and better suited for real-world applications. We establish theoretical convergence bounds, showing that CLoVE can recover clusters accurately with high probability in a single round and converges exponentially fast to optimal models in a linear setting. Our comprehensive experiments comparing with a variety of both CFL and generic Personalized Federated Learning (PFL) algorithms on different types of datasets and an extensive array of non-IID settings demonstrate that CLoVE achieves highly accurate cluster recovery in just a few rounds of training, along with state-of-the-art model accuracy, across a variety of both supervised and unsupervised PFL tasks.
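The loop the abstract describes (embed each client by its losses, cluster the embeddings, then aggregate per cluster) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear-regression clients, the squared loss, the plain k-means step, the brute-force cluster-to-model matching, and all names and hyperparameters (`loss_vectors`, `kmeans`, `match_clusters_to_models`, `fedavg_round`, the learning rate, the round count) are hypothetical choices made for this example.

```python
import numpy as np
from itertools import permutations

def loss_vectors(models, client_data):
    """Row i is client i's embedding: its mean squared error under each of the K models."""
    return np.array([[np.mean((X @ w - y) ** 2) for w in models]
                     for X, y in client_data])

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def match_clusters_to_models(L, labels, k):
    """Choose the cluster-to-model assignment with the lowest total mean loss.
    Brute force over permutations, so only sensible for small K."""
    cost = np.array([L[labels == j].mean(axis=0) for j in range(k)])
    return min(permutations(range(k)),
               key=lambda p: sum(cost[j, p[j]] for j in range(k)))

def fedavg_round(w, members, lr=0.05, local_steps=10):
    """Each member runs local gradient steps from w; the server averages by sample count."""
    updates, sizes = [], []
    for X, y in members:
        w_local = w.copy()
        for _ in range(local_steps):
            w_local -= lr * 2 * X.T @ (X @ w_local - y) / len(y)
        updates.append(w_local)
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=sizes)

# Toy data: six clients drawn from two hidden clusters with different true weights.
rng = np.random.default_rng(0)
d, K = 5, 2
true_w = [np.full(d, 3.0), np.full(d, -3.0)]
true_cluster = np.array([0, 1, 0, 1, 0, 1])
clients = []
for c in true_cluster:
    X = rng.normal(size=(200, d))
    clients.append((X, X @ true_w[c] + 0.1 * rng.normal(size=200)))

# Initial models are deliberately far from either optimum.
models = [np.full(d, 1.0), np.full(d, -0.5)]
for _ in range(20):
    L = loss_vectors(models, clients)                 # embed clients by their losses
    labels = kmeans(L, K)                             # separate clients into clusters
    assign = match_clusters_to_models(L, labels, K)   # pair each cluster with a model
    for j in range(K):
        members = [clients[i] for i in np.flatnonzero(labels == j)]
        models[assign[j]] = fedavg_round(models[assign[j]], members)

print(labels)
print(np.round(models, 2))
```

In this toy setup the loss vectors of clients from the same hidden cluster land close together even though neither initial model is near an optimum, which is the intuition the abstract appeals to. The paper's actual clustering and matching procedures may differ from the ones swapped in here.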
