Federated Style-Aware Transformer Aggregation of Representations
This work addresses biased predictions and poor generalization for clients with divergent data distributions in federated learning, and represents an incremental improvement over traditional methods.
The paper tackles personalization in federated learning under data heterogeneity and communication constraints. It proposes FedSTAR, which disentangles client-specific style factors from shared content and aggregates class-wise prototypes with Transformer-based attention, improving personalization and robustness without increasing communication cost.
Personalized Federated Learning (PFL) faces persistent challenges, including domain heterogeneity from diverse client data, data imbalance due to skewed participation, and strict communication constraints. Traditional federated learning often lacks personalization, as a single global model cannot capture client-specific characteristics, leading to biased predictions and poor generalization, especially for clients with highly divergent data distributions. To address these issues, we propose FedSTAR, a style-aware federated learning framework that disentangles client-specific style factors from shared content representations. FedSTAR aggregates class-wise prototypes using a Transformer-based attention mechanism, allowing the server to adaptively weight client contributions while preserving personalization. Furthermore, by exchanging compact prototypes and style vectors instead of full model parameters, FedSTAR significantly reduces communication overhead. Experimental results demonstrate that combining content-style disentanglement with attention-driven prototype aggregation improves personalization and robustness in heterogeneous environments without increasing communication cost.
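To make the attention-driven prototype aggregation more concrete, the following is a minimal sketch of how a server might weight class-wise prototypes from several clients. It is not the authors' implementation: the abstract does not specify the attention architecture, so this sketch assumes a single head of scaled dot-product attention with no learned projections, uses the mean of the client prototypes as the query, and omits the style vectors entirely. The function names `attention_aggregate` and `aggregate_all_classes` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(prototypes: List[Vector]) -> Tuple[Vector, List[float]]:
    """Aggregate one class's client prototypes with dot-product attention.

    Assumption: the query is the mean of the client prototypes, and the
    prototypes themselves serve as both keys and values.
    """
    dim = len(prototypes[0])
    count = len(prototypes)
    query = [sum(p[i] for p in prototypes) / count for i in range(dim)]
    # Scaled dot-product score between the query and each client prototype.
    scores = [
        sum(q * k for q, k in zip(query, p)) / math.sqrt(dim)
        for p in prototypes
    ]
    weights = softmax(scores)
    # Weighted sum of the client prototypes (the attention output).
    aggregated = [
        sum(w * p[i] for w, p in zip(weights, prototypes)) for i in range(dim)
    ]
    return aggregated, weights


def aggregate_all_classes(
    client_prototypes: Dict[str, Dict[int, Vector]]
) -> Dict[int, Vector]:
    """Group prototypes by class label, then aggregate each class separately.

    Clients that hold no samples of a class simply contribute nothing to it.
    """
    by_class: Dict[int, List[Vector]] = {}
    for prototypes in client_prototypes.values():
        for label, vector in prototypes.items():
            by_class.setdefault(label, []).append(vector)
    return {
        label: attention_aggregate(vectors)[0]
        for label, vectors in by_class.items()
    }
```

In this sketch, a client whose prototype points away from the consensus direction receives a smaller weight than clients that agree with it, which is one simple way a server could adaptively weight contributions. Each client uploads only one vector per class it holds, so the payload scales with the number of classes times the embedding dimension rather than with the model's parameter count.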