CV · Nov 12, 2023

Concept-wise Fine-tuning Matters in Preventing Negative Transfer

arXiv:2311.06868v1 · 3 citations · h-index: 4
Originality Highly original
AI Analysis

The paper addresses a critical issue for practitioners who fine-tune pre-trained models and offers a novel way to improve performance, though the contribution is incremental relative to existing fine-tuning techniques.

The paper tackles the problem of negative transfer in fine-tuning pre-trained models, caused by rare and spuriously correlated features, and proposes Concept-Tuning, which improves state-of-the-art methods by up to 4.76% across eleven datasets.

A multitude of prevalent pre-trained models mark a major milestone in the development of artificial intelligence, and fine-tuning has become the common practice that enables pre-trained models to be used across a wide array of target datasets. Our empirical results reveal that off-the-shelf fine-tuning techniques are far from adequate to mitigate negative transfer caused by two types of underperforming features in a pre-trained model: rare features and spuriously correlated features. Rooted in structural causal models of predictions after fine-tuning, we propose a Concept-wise fine-tuning (Concept-Tuning) approach that refines feature representations at the level of patches, with each patch encoding a concept. Concept-Tuning minimizes the negative impacts of rare features and spuriously correlated features by (1) maximizing the mutual information between examples in the same category with regard to a slice of rare features (a patch) and (2) applying front-door adjustment via attention neural networks over channels and feature slices (patches). The proposed Concept-Tuning consistently and significantly (by up to 4.76%) improves on prior state-of-the-art fine-tuning methods across eleven datasets, diverse pre-training strategies (supervised and self-supervised), various network architectures, and different sample sizes in a target dataset.
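The abstract does not give the exact objective used for step (1). As a rough illustration only, mutual information between same-category examples is often approximated with an InfoNCE-style contrastive loss, and the sketch below applies that common surrogate to one patch feature per example. The function names, the use of cosine similarity, and the `temperature` parameter are assumptions made for this example, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon guards against zero vectors


def same_class_patch_infonce(patch_feats, labels, temperature=0.1):
    """InfoNCE-style loss over one patch feature per example (hypothetical helper).

    Lower values mean same-class patch features are more similar to each
    other than to patch features of other classes. Minimizing this loss is
    a standard surrogate for maximizing a lower bound on mutual information.
    """
    n = len(patch_feats)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        # Scaled similarities between the anchor and every other example.
        logits = {
            j: cosine(patch_feats[i], patch_feats[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        # Average negative log-probability of picking a same-class example.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(same_class_patch_infonce(feats, [0, 0, 1, 1]))
```

In a real fine-tuning setup this quantity would be computed on patch features from the network in a differentiable framework and added to the classification loss. The pure-Python version here only shows the arithmetic.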

Code Implementations: 1 repo
