CV · LG · Sep 4, 2021

Robust fine-tuning of zero-shot models

arXiv:2109.01903v3 · 984 citations
Originality: Incremental advance
AI Analysis

The paper addresses the loss of robustness that practitioners encounter when fine-tuning pre-trained models. The contribution is incremental because it builds on existing fine-tuning methods.

The paper tackles a known side effect of fine-tuning zero-shot models such as CLIP or ALIGN: fine-tuning often reduces robustness to distribution shift. It introduces WiSE-FT, which ensembles the weights of the zero-shot and fine-tuned models. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work, and it improves accuracy by 0.8 to 3.3 pp over standard fine-tuning on seven commonly used transfer learning datasets.

Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple and effective method for improving robustness while fine-tuning: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work while increasing ImageNet accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness gains (2 to 23 pp) on a diverse set of six further distribution shifts, and accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on seven commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
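The weight ensembling described in the abstract can be illustrated with a minimal sketch. It assumes the ensemble is a per-parameter linear interpolation, theta = (1 - alpha) * theta_zero_shot + alpha * theta_fine_tuned, with a mixing coefficient alpha in [0, 1]. The function name `interpolate_weights`, the parameter names, and the toy values are hypothetical and chosen only for illustration; weights are represented as plain Python lists rather than framework tensors, and this is not the authors' released code.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def interpolate_weights(zero_shot: StateDict, fine_tuned: StateDict, alpha: float) -> StateDict:
    """Linearly interpolate two checkpoints that share one architecture.

    alpha = 0.0 returns the zero-shot weights, alpha = 1.0 the fine-tuned weights.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("checkpoints must contain the same parameter names")

    merged: StateDict = {}
    for name, zs_values in zero_shot.items():
        ft_values = fine_tuned[name]
        if len(zs_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [(1.0 - alpha) * z + alpha * f for z, f in zip(zs_values, ft_values)]
    return merged


# Toy checkpoints (made-up values, not real model weights).
zero_shot = {"linear.weight": [1.0, 2.0], "linear.bias": [0.0]}
fine_tuned = {"linear.weight": [3.0, 4.0], "linear.bias": [1.0]}

merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
print(merged)
```

Because the result is a single set of weights with the same shape as either input, inference runs one model, which is consistent with the abstract's statement that the method adds no computational cost at inference time. In practice alpha would be treated as a tunable hyperparameter; the value 0.5 here is only an example.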

Code Implementations · 3 repos
