LG · Mar 11

Client-Conditional Federated Learning via Local Training Data Statistics

arXiv:2603.11307v16.2 · h-index: 1
Predicted impact: top 83% in LG · last 90 days · Originality: Highly original
AI Analysis

This paper addresses data heterogeneity in federated learning with an approach that adds no communication cost and is reported to match or outperform the baselines it is compared against.

The paper tackles the challenge of federated learning under data heterogeneity by proposing a method that conditions a single global model on locally-computed PCA statistics of each client's training data, requiring no extra communication. It matches or surpasses an Oracle baseline across 97 configurations, achieving up to 6% improvement on combined heterogeneity and demonstrating robustness to data sparsity.

Federated learning (FL) under data heterogeneity remains challenging: existing methods either ignore client differences (FedAvg), require costly cluster discovery (IFCA), or maintain per-client models (Ditto). All degrade when data is sparse or heterogeneity is multi-dimensional. We propose conditioning a single global model on locally-computed PCA statistics of each client's training data, requiring zero additional communication. Evaluating across 97 configurations spanning four heterogeneity types (label shift, covariate shift, concept shift, and combined heterogeneity), four datasets (MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100), and seven FL baseline methods, we find that our method matches the Oracle baseline (which knows true cluster assignments) across all settings, surpasses it by 1-6% on combined heterogeneity where continuous statistics are richer than discrete cluster identifiers, and is uniquely sparsity-robust among all tested methods.
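The abstract does not say which PCA statistics are used or how they enter the model, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the conditioning vector is the top-k explained-variance ratios plus the top-k principal directions, and that conditioning is done by concatenating this vector to the input of a shared linear layer. The names `local_pca_statistics` and `conditioned_forward` are hypothetical and do not come from the paper.

```python
import numpy as np


def local_pca_statistics(X, k=2):
    """Summarize a client's local data as a fixed-length vector.

    Assumed layout (not from the paper): k explained-variance ratios
    followed by the k principal directions, flattened.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / max(len(X) - 1, 1)
    ratio = var[:k] / (var.sum() + 1e-12)
    comps = vt[:k]
    # Resolve the sign ambiguity of PCA: make the largest-magnitude
    # entry of each direction positive so the vector is reproducible.
    idx = np.argmax(np.abs(comps), axis=1)
    signs = np.sign(comps[np.arange(len(comps)), idx])
    signs = np.where(signs == 0, 1.0, signs)
    return np.concatenate([ratio, (comps * signs[:, None]).ravel()])


def conditioned_forward(x, stats, W, b):
    """Shared (global) linear layer applied to [x ; stats].

    W and b are the same on every client; only stats differs, and it
    is computed locally, so nothing extra is communicated.
    """
    return W @ np.concatenate([x, stats]) + b


# Two synthetic clients with different feature scales (covariate shift).
rng = np.random.default_rng(0)
client_a = rng.normal(size=(50, 4)) * np.array([3.0, 1.0, 0.5, 0.1])
client_b = rng.normal(size=(50, 4)) * np.array([0.1, 0.5, 1.0, 3.0])

stats_a = local_pca_statistics(client_a, k=2)
stats_b = local_pca_statistics(client_b, k=2)

# One set of global weights, used unchanged by both clients.
W = rng.normal(size=(3, 4 + stats_a.size))
b = np.zeros(3)
out_a = conditioned_forward(client_a[0], stats_a, W, b)
out_b = conditioned_forward(client_b[0], stats_b, W, b)
```

In this sketch the two clients share `W` and `b` but feed in different statistics vectors, which is how one global model can behave differently per client. Because the statistics are continuous rather than a discrete cluster ID, they can in principle encode several kinds of shift at once, which is consistent with the abstract's remark about combined heterogeneity.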
