Implicit Modeling for Transferability Estimation of Vision Foundation Models
This work addresses a key bottleneck for practitioners, the efficient deployment of pre-trained models, though it appears incremental because it builds on existing transferability estimation methods.
The paper tackles the problem of accurately estimating how well diverse pre-trained vision models transfer to downstream tasks without full fine-tuning. It proposes Implicit Transferability Modeling (ITM) with a Divide-and-Conquer Variational Approximation (DVA) strategy, and reports that ITM outperforms existing methods in stability, effectiveness, and efficiency on a comprehensive benchmark.
Transferability estimation identifies the best pre-trained models for downstream tasks without incurring the high computational cost of full fine-tuning. This capability facilitates deployment and advances the pre-training and fine-tuning paradigm. However, existing methods often struggle to accurately assess transferability for emerging pre-trained models with diverse architectures, training strategies, and task alignments. In this work, we propose Implicit Transferability Modeling (ITM), a novel framework that implicitly models each model's intrinsic transferability, coupled with a Divide-and-Conquer Variational Approximation (DVA) strategy to efficiently approximate embedding space evolution. This design enables generalization across a broader range of models and downstream tasks. Experiments on a comprehensive benchmark, spanning extensive training regimes and a wider variety of model types, demonstrate that ITM consistently outperforms existing methods in terms of stability, effectiveness, and efficiency.
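To make the task concrete, the sketch below shows how a transferability estimator is typically used and checked: each candidate model gets a cheap score, the top-scoring model is selected, and the ranking is compared with the accuracies obtained by actually fine-tuning. This is not ITM or DVA. The model names and all numbers are hypothetical, and Kendall's tau is assumed here only as one rank correlation commonly used in this literature; the text above does not state which metric the paper reports.

```python
from itertools import combinations


def kendall_tau(scores, accuracies):
    """Kendall's tau-a between two equal-length sequences (no tie correction)."""
    if len(scores) != len(accuracies) or len(scores) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(scores)), 2):
        # A pair is concordant when both sequences order models i and j the same way.
        product = (scores[i] - scores[j]) * (accuracies[i] - accuracies[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(scores) * (len(scores) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for illustration only; not taken from the paper.
estimated = {"model_a": 0.62, "model_b": 0.80, "model_c": 0.55, "model_d": 0.71}
finetuned = {"model_a": 0.84, "model_b": 0.91, "model_c": 0.86, "model_d": 0.88}

names = sorted(estimated)
tau = kendall_tau([estimated[n] for n in names], [finetuned[n] for n in names])

# Model selection: pick the candidate with the highest estimated score.
best = max(estimated, key=estimated.get)

print(f"selected model: {best}, rank agreement (tau): {tau:.3f}")
```

A tau near 1 means the estimator orders the candidates almost exactly as fine-tuning would, so the expensive fine-tuning runs can be skipped for all but the selected model.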