LG | AI | Apr 29, 2025

Token-Level Prompt Mixture with Parameter-Free Routing for Federated Domain Generalization

Shuai Gong, Chaoran Cui, Xiaolin Dong, Xiushan Nie, Lei Zhu, Xiaojun Chang

arXiv:2504.21063v14.11 citations | h-index: 6 | Has Code | IEEE Transactions on Image Processing

Originality: Highly original

AI Analysis

This work addresses the challenge of learning globally generalizable models from decentralized, heterogeneous data in federated learning, offering a more efficient and specialized approach for domain generalization tasks.

The paper tackles the performance degradation that a single global prompt causes in federated domain generalization. It proposes TRIP, a framework that uses token-level prompt mixtures with parameter-free routing, and reports the best generalization results across four benchmarks while communicating only 1K parameters per round.

Federated domain generalization (FedDG) aims to learn a globally generalizable model from decentralized clients with heterogeneous data while preserving privacy. Recent studies have introduced prompt learning to adapt vision-language models (VLMs) in FedDG by learning a single global prompt. However, such a one-prompt-fits-all learning paradigm typically leads to performance degradation on personalized samples. Although the mixture of experts (MoE) offers a promising solution for specialization, existing MoE-based methods suffer from coarse image-level expert assignment and high communication costs from parameterized routers. To address these limitations, we propose TRIP, a Token-level prompt mixture with parameter-free routing framework for FedDG, which treats multiple prompts as distinct experts. Unlike existing image-level routing designs, TRIP assigns different tokens within an image to specific experts. To ensure communication efficiency, TRIP incorporates a parameter-free routing mechanism based on token clustering and optimal transport. The instance-specific prompt is then synthesized by aggregating experts, weighted by the number of tokens assigned to each. Additionally, TRIP develops an unbiased learning strategy for prompt experts, leveraging the VLM's zero-shot generalization capability. Extensive experiments across four benchmarks demonstrate that TRIP achieves optimal generalization results, with communication of only 1K parameters per round. Our code is available at https://github.com/GongShuai8210/TRIP.
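
The aggregation step described in the abstract, where the instance-specific prompt is a mixture of prompt experts weighted by how many tokens were routed to each, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_prompts`, the plain-list vector layout, and the toy values are all assumptions made for illustration, and the routing itself (token clustering and optimal transport) is not implemented here, so the token-to-expert assignments are taken as given.

```python
from collections import Counter


def mix_prompts(expert_prompts, token_assignments):
    """Combine expert prompts, weighting each by its share of routed tokens.

    expert_prompts: list of equal-length float vectors, one per expert
        (hypothetical layout, chosen for illustration).
    token_assignments: list of expert indices, one per image token,
        assumed to come from a routing step that is not shown here.
    """
    if not token_assignments:
        raise ValueError("need at least one routed token")

    counts = Counter(token_assignments)   # tokens routed to each expert
    total = len(token_assignments)
    dim = len(expert_prompts[0])

    mixed = [0.0] * dim
    for expert_idx, prompt in enumerate(expert_prompts):
        # Experts that received no tokens get weight 0.
        weight = counts.get(expert_idx, 0) / total
        for d in range(dim):
            mixed[d] += weight * prompt[d]
    return mixed


# Toy example: three experts with 2-dimensional prompts, four image tokens.
experts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
assignments = [0, 0, 1, 0]  # three tokens go to expert 0, one to expert 1
instance_prompt = mix_prompts(experts, assignments)
```

In this toy case expert 0 receives three of the four tokens and expert 1 receives one, so the mixture weights are 0.75, 0.25, and 0. Because the weights come from token counts alone, the mixing step in this sketch adds no learnable parameters, which matches the abstract's description of the routing as parameter-free.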

