Active Few-Shot Fine-Tuning
This work addresses data efficiency for practitioners fine-tuning large neural networks, though the contribution appears incremental because it builds on established active learning concepts.
The paper tackles the problem of selecting the right data for fine-tuning to a specific task, which it calls active fine-tuning, and shows that ITL (information-based transductive learning) learns tasks with significantly fewer examples than state-of-the-art methods.
We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of classical active learning. We propose ITL, short for information-based transductive learning, an approach which samples adaptively to maximize information gained about the specified task. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We apply ITL to the few-shot fine-tuning of large neural networks and show that fine-tuning with ITL learns the task with significantly fewer examples than the state-of-the-art.
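The selection rule described above, sampling adaptively to maximize the information gained about the specified task, can be illustrated with a small sketch. It assumes a Gaussian process surrogate with an RBF kernel and Gaussian observation noise, which the text above does not specify; the function names `rbf_kernel` and `itl_select` and all parameter values are hypothetical. Under that assumption, the information a noisy observation y_x carries about the function values at the target points A is 0.5 * log(Var[y_x | D] / Var[y_x | f_A, D]), and the sketch greedily picks the candidate that maximizes it. This is an illustration of the idea, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of X and Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def itl_select(candidates, targets, n_select, noise_var=0.1, jitter=1e-8):
    """Greedily pick candidate indices maximizing I(f_targets; y_x | selected).

    Assumption: f is modeled as a zero-mean GP with an RBF kernel, and
    observations carry independent Gaussian noise with variance noise_var.
    """
    n_c = len(candidates)
    points = np.vstack([candidates, targets])
    K = rbf_kernel(points, points)  # prior covariance over candidates and targets
    tgt = np.arange(n_c, len(points))
    selected = []
    for _ in range(n_select):
        K_ct = K[:n_c][:, tgt]
        K_tt = K[np.ix_(tgt, tgt)] + jitter * np.eye(len(tgt))
        var = np.diag(K)[:n_c]
        # Part of each candidate's variance explained by knowing f at the targets.
        explained = np.einsum("ij,ji->i", K_ct, np.linalg.solve(K_tt, K_ct.T))
        var_given_targets = np.clip(var - explained, 0.0, None)
        # Information gain: ratio of predictive variances of the noisy observation.
        gain = 0.5 * np.log((var + noise_var) / (var_given_targets + noise_var))
        best = int(np.argmax(gain))
        selected.append(best)
        # Rank-one posterior covariance update after observing y at `best`.
        k = K[:, best].copy()
        K = K - np.outer(k, k) / (K[best, best] + noise_var)
    return selected

# Hypothetical 1-D example: two candidates far from the target, two close to it.
candidates = np.array([[0.0], [0.1], [5.0], [5.1]])
targets = np.array([[5.05]])
print(itl_select(candidates, targets, n_select=3))
```

In this toy setup the candidates near the target are strongly correlated with it under the kernel, so they carry far more information about the task than the distant ones. For fine-tuning a neural network, one would presumably replace the RBF kernel with a kernel derived from the network (for example, over embeddings), which is a further assumption beyond what the abstract states.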