Continual Model Routing in Evolving Model Hubs
The paper addresses two problems facing practitioners who build mixture-of-experts systems from rapidly growing AI model hubs: scaling model selection across thousands of candidate models, and keeping routing mechanisms up to date as new models and tasks arrive.
The paper formalizes Continual Model Routing (CMR) for evolving model hubs and introduces CMRBench, a benchmark with over 2,000 candidate models. The proposed CARvE method outperforms zero-shot retrieval, fine-tuning, and adapter-merging baselines in model-, family-, and domain-level accuracy.
AI model hubs provide access to a rapidly growing collection of powerful pre-trained models, enabling off-the-shelf mixture-of-experts systems with different routing strategies. However, this rapid growth poses two fundamental challenges: scaling model selection across thousands of experts and continually updating routing mechanisms as new models and tasks are introduced. In this paper, we formalise this setting as Continual Model Routing (CMR) and propose CMRBench, a new large-scale benchmark simulating realistic hub expansion and including over 2,000 candidate models. Finally, we introduce CARvE, a contrastive embedding approach for efficient continual model routing via checkpoint-based anchoring and structured replay. Extensive empirical results and ablations show that CARvE significantly outperforms zero-shot retrieval, fine-tuning, and adapter-merging baselines in model, family, and domain-level accuracy.
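To make the routing setting concrete, the following is a minimal sketch of a retrieval-style router scored at the model, family, and domain levels. It is not the CARvE method and does not use CMRBench data. The names `ModelCard`, `Router`, and `hierarchical_accuracy`, the toy three-dimensional embeddings, and the example model names are all assumptions made for illustration. The sketch only shows the general shape of the problem: queries and models share an embedding space, the router picks the nearest model, and the hub can grow after the router is built.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical hub entry: a model with its family, domain, and embedding."""
    name: str
    family: str
    domain: str
    embedding: tuple


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Router:
    """Routes a query embedding to the most similar registered model."""

    def __init__(self):
        self.models = []

    def add_model(self, card):
        # Hub expansion: a new model becomes routable as soon as it is added.
        self.models.append(card)

    def route(self, query_embedding):
        return max(self.models, key=lambda m: cosine(query_embedding, m.embedding))


def hierarchical_accuracy(router, queries):
    """Fraction of queries routed correctly at model, family, and domain level."""
    hits = {"model": 0, "family": 0, "domain": 0}
    for embedding, target in queries:
        chosen = router.route(embedding)
        hits["model"] += chosen.name == target.name
        hits["family"] += chosen.family == target.family
        hits["domain"] += chosen.domain == target.domain
    return {level: count / len(queries) for level, count in hits.items()}


# Toy hub with made-up models and embeddings (illustrative assumptions only).
med_bert = ModelCard("med-bert-a", "bert", "medical", (1.0, 0.0, 0.0))
law_bert = ModelCard("law-bert-a", "bert", "legal", (0.0, 1.0, 0.0))
med_t5 = ModelCard("med-t5-a", "t5", "medical", (0.8, 0.0, 0.6))

router = Router()
router.add_model(med_bert)
router.add_model(law_bert)
router.add_model(med_t5)  # added later, simulating hub growth

queries = [
    ((0.9, 0.1, 0.0), med_bert),
    ((0.1, 0.9, 0.0), law_bert),
    ((0.7, 0.0, 0.7), med_t5),
    ((1.0, 0.0, 0.2), med_t5),  # lands on med-bert-a: wrong model, right domain
]
scores = hierarchical_accuracy(router, queries)
print(scores)
```

In this toy run the last query is routed to a different model in the same domain, so domain-level accuracy is higher than model-level accuracy. That gap is the reason for reporting accuracy at several levels of granularity.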