pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
This work addresses a limitation of prompt tuning for visual adaptation, namely its reliance on a single pre-trained model, by drawing on knowledge from diverse domains. It is relevant to researchers and practitioners working on efficient model adaptation.
The paper proposes pMoE, a Mixture-of-Experts prompt tuning method that integrates diverse domain knowledge from multiple pre-trained models. This approach significantly enhances model versatility and achieves superior performance across 47 adaptation tasks in general and medical domains, offering an optimal trade-off between computational efficiency and adaptation effectiveness.
Parameter-efficient fine-tuning has demonstrated promising results across various visual adaptation tasks, such as classification and segmentation. Prompt tuning techniques have typically harnessed knowledge from a single pre-trained model, whether from a general or a specialized medical domain. However, this approach overlooks the potential synergies that could arise from integrating diverse domain knowledge within the same tuning process. In this work, we propose a novel Mixture-of-Experts prompt tuning method called pMoE, which leverages the strengths of multiple expert domains through expert-specialized prompt tokens and a learnable dispatcher, effectively combining their expertise in a unified model framework. pMoE introduces expert-specific prompt tokens and uses a dynamic token dispatching mechanism at various prompt layers to optimize the contribution of each domain expert during the adaptation phase. By incorporating domain knowledge from diverse experts, the proposed pMoE significantly enhances the model's versatility and applicability to a broad spectrum of tasks. We conduct extensive experiments across 47 adaptation tasks, including both classification and segmentation in general and medical domains. The results demonstrate that pMoE not only achieves superior performance by a large margin but also offers an optimal trade-off between computational efficiency and adaptation effectiveness compared to existing methods.
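The idea of expert-specific prompt tokens combined by a learnable dispatcher can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the dispatcher works, so the example assumes a simple per-token softmax gate over experts. The names `dispatch_prompts` and `gate_logits`, and the tensor layout, are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dispatch_prompts(expert_prompts, gate_logits):
    """Mix expert-specific prompt tokens with per-token softmax gates.

    Assumed layout (hypothetical, not taken from the paper):
      expert_prompts: [num_experts][num_tokens][dim]
      gate_logits:    [num_tokens][num_experts], standing in for the
                      dispatcher's learnable parameters
    Returns the mixed prompt tokens as [num_tokens][dim].
    """
    num_experts = len(expert_prompts)
    num_tokens = len(expert_prompts[0])
    dim = len(expert_prompts[0][0])

    mixed = []
    for t in range(num_tokens):
        # One weight per expert for this prompt position; weights sum to 1.
        weights = softmax(gate_logits[t])
        token = [
            sum(weights[e] * expert_prompts[e][t][d] for e in range(num_experts))
            for d in range(dim)
        ]
        mixed.append(token)
    return mixed


# Two toy experts, two prompt tokens of dimension 2.
expert_prompts = [
    [[0.0, 0.0], [0.0, 0.0]],  # e.g. prompts tied to a general-domain model
    [[2.0, 2.0], [2.0, 2.0]],  # e.g. prompts tied to a medical-domain model
]
# Token 0 weights both experts equally; token 1 strongly favors expert 1.
gate_logits = [[0.0, 0.0], [-50.0, 50.0]]

mixed = dispatch_prompts(expert_prompts, gate_logits)
print(mixed)
```

In a real system the gate logits would be trained with the prompt tokens, and a separate gate could be kept for each prompt layer. The sketch fixes them by hand only to show how the weighting changes the mixed prompt.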