LG, CL | Oct 20, 2023

Foundation Model's Embedded Representations May Detect Distribution Shift

arXiv:2310.13836v2 | 1 citation | h-index: 6
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of misleading generalization metrics for researchers who use foundation models, though it is an incremental analysis of existing methods on known data.

The paper investigates the distribution shift between automatically labeled training data and manually curated test data in transfer learning. It shows that training on the biased data harms performance on the test set by 5-15% compared to using pre-trained representations directly.

Sampling biases can cause distribution shifts between train and test datasets for supervised learning tasks, obscuring our ability to understand the generalization capacity of a model. This is especially important considering the wide adoption of pre-trained foundational neural networks -- whose behavior remains poorly understood -- for transfer learning (TL) tasks. We present a case study for TL on the Sentiment140 dataset and show that many pre-trained foundation models encode different representations of Sentiment140's manually curated test set $M$ from the automatically labeled training set $P$, confirming that a distribution shift has occurred. We argue training on $P$ and measuring performance on $M$ is a biased measure of generalization. Experiments on pre-trained GPT-2 show that the features learnable from $P$ do not improve (and in fact hamper) performance on $M$. Linear probes on pre-trained GPT-2's representations are robust and may even outperform overall fine-tuning, implying a fundamental importance for discerning distribution shift in train/test splits for model interpretation.
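
One way to make "the model encodes different representations of $M$ and $P$" concrete is a domain probe: a linear classifier trained to predict which set an embedding came from. The sketch below is illustrative only and is not taken from the paper. It uses synthetic Gaussian vectors as stand-ins for GPT-2 embeddings, and the function name `domain_probe_accuracy` and all hyperparameters are assumptions. Held-out accuracy near 0.5 suggests no linearly detectable shift, while accuracy well above 0.5 suggests the two sets are represented differently.

```python
import numpy as np


def domain_probe_accuracy(emb_p, emb_m, epochs=300, lr=0.1, seed=0):
    """Fit a logistic-regression probe separating emb_p (label 0) from emb_m (label 1).

    Returns accuracy on a held-out 30% split of the pooled embeddings.
    """
    rng = np.random.default_rng(seed)
    x = np.vstack([emb_p, emb_m])
    y = np.concatenate([np.zeros(len(emb_p)), np.ones(len(emb_m))])

    # Shuffle the pooled data, then hold out 30% for evaluation.
    order = rng.permutation(len(x))
    x, y = x[order], y[order]
    split = int(0.7 * len(x))
    x_tr, y_tr, x_te, y_te = x[:split], y[:split], x[split:], y[split:]

    # Standardize features using training statistics only.
    mu, sd = x_tr.mean(axis=0), x_tr.std(axis=0) + 1e-8
    x_tr, x_te = (x_tr - mu) / sd, (x_te - mu) / sd

    # Plain batch gradient descent on the logistic loss.
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        logits = np.clip(x_tr @ w + b, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-logits))
        grad = prob - y_tr
        w -= lr * (x_tr.T @ grad) / len(y_tr)
        b -= lr * grad.mean()

    pred = (x_te @ w + b) > 0
    return float((pred == y_te.astype(bool)).mean())


# Synthetic stand-ins for embeddings (assumption: real use would pass
# model representations of the P and M examples instead).
rng = np.random.default_rng(42)
dim = 16
emb_p = rng.normal(0.0, 1.0, size=(500, dim))          # "training set P"
emb_m_shifted = rng.normal(1.0, 1.0, size=(500, dim))  # M drawn from a shifted distribution
emb_m_same = rng.normal(0.0, 1.0, size=(500, dim))     # M drawn from the same distribution

acc_shifted = domain_probe_accuracy(emb_p, emb_m_shifted)
acc_same = domain_probe_accuracy(emb_p, emb_m_same)
print(f"probe accuracy, shifted M: {acc_shifted:.2f}")
print(f"probe accuracy, same-distribution M: {acc_same:.2f}")
```

In practice the two sets can differ greatly in size, so a real analysis would likely need to balance the classes or compare against a majority-class baseline before reading anything into the probe accuracy.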
