Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences
This work matters to users and developers of LLMs in federated learning environments: it enables personalized preference alignment without compromising privacy, addressing a key limitation of monolithic reward models, which average out conflicting user preferences.
This paper addresses the challenge of aligning Large Language Models (LLMs) with diverse user preferences in a privacy-preserving federated learning setting, where existing methods rely on a single reward model that averages out conflicting preferences. The authors propose FedVPA-GP (Federated Variational Preference Alignment with Gumbel-Softmax Prior), which disentangles user preferences and supports dynamic preference switching. On the HH-RLHF dataset, it outperforms monolithic baselines.
Federated Learning (FL) offers a privacy-preserving pathway for aligning Large Language Models (LLMs); however, existing frameworks typically enforce a monolithic reward model, inevitably averaging out inherently conflicting user preferences (e.g., helpfulness vs. harmlessness). While Variational Preference Learning (VPL) offers a pathway to personalization, adapting it to decentralized settings presents a fundamental challenge: posterior collapse driven by severe local data scarcity and heterogeneity. In this paper, we propose Federated Variational Preference Alignment with Gumbel-Softmax Prior (FedVPA-GP), a framework designed to disentangle diverse preferences without compromising privacy. To stabilize variational inference, we introduce a Federated Mixture Prior that enables clients to leverage the aggregate population distribution as a dynamic prior. Furthermore, we incorporate an Orthogonal Loss that explicitly enforces the separation of preference prototypes in the latent space. Experiments on the HH-RLHF dataset demonstrate that FedVPA-GP significantly outperforms monolithic baselines, successfully disentangling conflicting user intents and enabling dynamic preference switching.
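The three ingredients named above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not give the exact formulations, so the function names, the sample-size weighting used to aggregate client posteriors, and the squared-cosine form of the orthogonality penalty are all assumptions made for illustration. The sketch also assumes the latent preference variable is categorical over K prototypes, as the Gumbel-Softmax relaxation suggests.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample over K preference prototypes.

    Standard Gumbel-Softmax relaxation: add Gumbel(0, 1) noise to the
    logits, divide by the temperature tau, and apply a softmax.
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def federated_mixture_prior(client_posteriors, client_sizes):
    """Aggregate per-client categorical posteriors into a population prior.

    ASSUMPTION: clients are weighted by local sample count (FedAvg-style);
    the abstract does not specify the aggregation rule.
    """
    total = sum(client_sizes)
    k = len(client_posteriors[0])
    return [
        sum(n * q[j] for q, n in zip(client_posteriors, client_sizes)) / total
        for j in range(k)
    ]


def orthogonal_loss(prototypes):
    """Penalize overlap between preference prototypes.

    ASSUMPTION: the penalty is the sum of squared cosine similarities over
    all distinct prototype pairs; it is zero when they are orthogonal.
    """
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    units = [unit(p) for p in prototypes]
    loss = 0.0
    for i in range(len(units)):
        for j in range(i + 1, len(units)):
            cos = sum(a * b for a, b in zip(units[i], units[j]))
            loss += cos * cos
    return loss


if __name__ == "__main__":
    # Hypothetical values: two clients, two prototypes (e.g. helpful, harmless).
    prior = federated_mixture_prior([[0.9, 0.1], [0.2, 0.8]], [100, 300])
    logits = [math.log(p) for p in prior]
    print(gumbel_softmax(logits, tau=0.5, rng=random.Random(0)))
    print(orthogonal_loss([[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch, lowering `tau` pushes samples toward a one-hot choice of prototype, while the orthogonality penalty vanishes only when the prototype vectors are mutually orthogonal. In a full training loop these pieces would presumably enter the variational objective as a KL term against the aggregated prior plus a weighted penalty, but that composition is likewise an assumption here.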