CL · Jun 19, 2024

In-Context Learning on a Budget: A Case Study in Token Classification

arXiv:2406.13274v2 · 11 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the practical challenge of domain adaptation for token classification when annotation resources are constrained. Its contribution is incremental, since it applies existing sample selection methods to a new setup.

The paper tackles few-shot in-context learning for token classification under a limited annotation budget. It finds that no sample selection method significantly outperforms the others, random selection included, and that a small annotated pool can achieve performance comparable to using the entire training set.

Few-shot in-context learning (ICL) typically assumes access to large annotated training sets. However, in many real-world scenarios, such as domain adaptation, there is only a limited budget to annotate a small number of samples, with the goal of maximizing downstream performance. We study various methods for selecting samples to annotate within a predefined budget, focusing on token classification tasks, which are expensive to annotate and are relatively less studied in ICL setups. Across various tasks, models, and datasets, we observe that no method significantly outperforms the others, with most yielding similar results, including random sample selection for annotation. Moreover, we demonstrate that a relatively small annotated sample pool can achieve performance comparable to using the entire training set. We hope that future work adopts our realistic paradigm, which takes annotation budget into account.
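The budgeted setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_pool_random` and `pick_demonstrations`, the dictionary layout of an annotated example, and the word-overlap heuristic for choosing demonstrations are all assumptions made for illustration. The first function models the random selection baseline (choose which unlabeled sentences to annotate within a fixed budget), and the second picks a few annotated examples from that pool as in-context demonstrations for one query.

```python
import random


def select_pool_random(unlabeled, budget, seed=0):
    """Choose `budget` unlabeled sentences uniformly at random for annotation.

    Hypothetical helper: models the random selection baseline only.
    """
    if budget > len(unlabeled):
        raise ValueError("budget exceeds the number of unlabeled samples")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(unlabeled, budget)


def pick_demonstrations(pool, query, k):
    """Return the k annotated examples sharing the most words with `query`.

    Assumption: each pool item is a dict with a "text" key. Word overlap is
    a stand-in heuristic, not a method taken from the paper.
    """
    query_words = set(query.lower().split())

    def overlap(example):
        return len(query_words & set(example["text"].lower().split()))

    # sorted() is stable, so ties keep their original pool order
    return sorted(pool, key=overlap, reverse=True)[:k]
```

In use, one would call `select_pool_random` once on the unlabeled data, annotate the returned sentences, and then call `pick_demonstrations` per test sentence to build the few-shot prompt. Swapping in a different selection strategy only requires replacing the first function, which is the comparison the abstract describes.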
