LG, CV | Feb 28

Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols

Xu Luo, Ji Zhang, Lianli Gao, Heng Tao Shen, Jingkuan Song
arXiv:2603.00478v1 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses unreliable benchmarking in few-shot transfer learning, giving researchers a rigorous tool for making reproducible progress.

The authors tackle the lack of a unified evaluation protocol for few-shot transfer learning by introducing FEWTRANS, a benchmark of 10 datasets, together with the Hyperparameter Ensemble (HPE) protocol. They find that the choice of pre-trained model is the dominant performance factor, and that many sophisticated transfer methods offer negligible practical advantage over simple full-parameter fine-tuning.

Few-shot transfer has been revolutionized by stronger pre-trained models and improved adaptation algorithms. However, the field lacks a unified, rigorous evaluation protocol that is both challenging and realistic for real-world usage. In this work, we establish FEWTRANS, a comprehensive benchmark containing 10 diverse datasets, and propose the Hyperparameter Ensemble (HPE) protocol to overcome the "validation set illusion" in data-scarce regimes. Our empirical findings demonstrate that the choice of pre-trained model is the dominant factor for performance, while many sophisticated transfer methods offer negligible practical advantages over a simple full-parameter fine-tuning baseline. To explain this surprising effectiveness, we provide an in-depth mechanistic analysis showing that full fine-tuning succeeds via distributed micro-adjustments and more flexible reshaping of high-level semantic representations, without suffering from overfitting. Additionally, using adjusted Zipf frequency scores, we quantify the performance collapse of multimodal models in specialized domains and attribute it to linguistic rarity. By releasing FEWTRANS, we aim to provide a rigorous "ruler" to streamline reproducible advances in few-shot transfer learning research. We make the FEWTRANS benchmark publicly available at https://github.com/Frankluox/FewTrans.

