LGAIJun 27, 2025

CLoVE: Personalized Federated Learning through Clustering of Loss Vector Embeddings

arXiv:2506.22427v11 citationsh-index: 42
Originality Highly original
AI Analysis

This addresses the problem of personalized federated learning for clients with non-IID data distributions, offering a robust and simple method that is incremental over existing CFL approaches.

The paper tackles the challenge of identifying client clusters in Clustered Federated Learning (CFL) by proposing CLoVE, which uses loss vector embeddings to separate clients and optimize cluster-specific models, achieving highly accurate cluster recovery in a few rounds and state-of-the-art model accuracy in experiments.

We propose CLoVE (Clustering of Loss Vector Embeddings), a novel algorithm for Clustered Federated Learning (CFL). In CFL, clients are naturally grouped into clusters based on their data distribution. However, identifying these clusters is challenging, as client assignments are unknown. CLoVE utilizes client embeddings derived from model losses on client data, and leverages the insight that clients in the same cluster share similar loss values, while those in different clusters exhibit distinct loss patterns. Based on these embeddings, CLoVE is able to iteratively identify and separate clients from different clusters and optimize cluster-specific models through federated aggregation. Key advantages of CLoVE over existing CFL algorithms are (1) its simplicity, (2) its applicability to both supervised and unsupervised settings, and (3) the fact that it eliminates the need for near-optimal model initialization, which makes it more robust and better suited for real-world applications. We establish theoretical convergence bounds, showing that CLoVE can recover clusters accurately with high probability in a single round and converges exponentially fast to optimal models in a linear setting. Our comprehensive experiments comparing with a variety of both CFL and generic Personalized Federated Learning (PFL) algorithms on different types of datasets and an extensive array of non-IID settings demonstrate that CLoVE achieves highly accurate cluster recovery in just a few rounds of training, along with state-of-the-art model accuracy, across a variety of both supervised and unsupervised PFL tasks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes