CL | May 29, 2025

One Task Vector is not Enough: A Large-Scale Study for In-Context Learning

arXiv:2505.23911v1 | 2 citations | h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work addresses a fundamental limitation in how researchers analyze in-context learning and provides large-scale empirical insights. It is incremental, however, because it builds on existing hypotheses with new data.

The study examines task vectors in in-context learning for large language models. It introduces the QuiteAFew dataset of 3,096 tasks and finds that task vector performance peaks at intermediate layers, that effectiveness varies by task type, and that complex tasks require multiple vectors rather than a single one.

In-context learning (ICL) enables Large Language Models (LLMs) to adapt to new tasks using few examples, with task vectors - specific hidden state activations - hypothesized to encode task information. Existing studies are limited by small-scale benchmarks, restricting comprehensive analysis. We introduce QuiteAFew, a novel dataset of 3,096 diverse few-shot tasks, each with 30 input-output pairs derived from the Alpaca dataset. Experiments with Llama-3-8B on QuiteAFew reveal: (1) task vector performance peaks at an intermediate layer (e.g., 15th), (2) effectiveness varies significantly by task type, and (3) complex tasks rely on multiple, subtask-specific vectors rather than a single vector, suggesting distributed task knowledge representation.

