CL | AI | LG | Jan 22, 2024

Enhancing In-context Learning via Linear Probe Calibration

arXiv:2401.12406v1 | 20 citations | h-index: 30 | Has Code | AISTATS
Originality: Incremental advance
AI Analysis

The paper addresses a scalability and robustness problem for users of ICL in natural language processing, offering an incremental improvement over existing methods.

The paper tackles the unreliability and lack of robustness in in-context learning (ICL) for GPT-like models by proposing Linear Probe Calibration (LinC), which improves test performance by up to 21% on average and up to 50% in some cases, while requiring only minimal labeled data.

In-context learning (ICL) is a new paradigm for natural language processing that utilizes Generative Pre-trained Transformer (GPT)-like models. This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. However, applying ICL in real cases does not scale with the number of samples, and lacks robustness to different prompt templates and demonstration permutations. In this paper, we first show, using a new metric based on Shannon entropy, that GPT-like models using ICL produce unreliable predictions. Then, to solve this problem, we propose Linear Probe Calibration (LinC), a method that calibrates the model's output probabilities, resulting in reliable predictions and improved performance while requiring only minimal additional data (as few as five labeled samples). LinC significantly enhances the ICL test performance of GPT models on various benchmark datasets, with an average improvement of up to 21% and up to a 50% improvement in some cases, and significantly boosts the performance of PEFT methods, especially in the low-resource regime. Moreover, LinC achieves lower expected calibration error and is highly robust to varying label proportions, prompt templates, and demonstration permutations. Our code is available at https://github.com/mominabbass/LinC.
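To make the idea of calibrating output probabilities with a handful of labeled samples concrete, here is a minimal sketch. The abstract does not give the exact form of the calibration, so this is an assumption: the sketch applies an affine map to the model's class-probability vector, passes the result through a softmax, and fits the map by plain gradient descent on cross-entropy. All function names and the toy numbers are hypothetical and are not taken from the authors' repository. A Shannon entropy helper is included only to illustrate the kind of quantity an entropy-based reliability metric would build on.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def shannon_entropy(p, eps=1e-12):
    # Entropy (in nats) of each row of a probability matrix.
    p = np.clip(p, eps, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def cross_entropy(q, labels):
    # Mean negative log-likelihood of the true labels.
    return -np.log(q[np.arange(len(labels)), labels]).mean()

def calibrate(probs, A, b):
    # Assumed calibration form: softmax(A p + b) for each row p.
    return softmax(probs @ A.T + b)

def fit_linear_probe(probs, labels, lr=0.5, steps=200):
    # Start from the identity map so training begins near the raw ICL output.
    n, k = probs.shape
    A, b = np.eye(k), np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        q = calibrate(probs, A, b)
        grad_z = (q - onehot) / n       # gradient of the loss w.r.t. A p + b
        A -= lr * grad_z.T @ probs      # gradient w.r.t. A
        b -= lr * grad_z.sum(axis=0)    # gradient w.r.t. b
    return A, b

# Toy ICL outputs for five labeled samples (made-up numbers): the model
# leans toward class 0 even when the true label is 1.
probs = np.array([[0.70, 0.30], [0.60, 0.40], [0.80, 0.20],
                  [0.65, 0.35], [0.90, 0.10]])
labels = np.array([1, 1, 0, 1, 0])

A, b = fit_linear_probe(probs, labels)
calibrated = calibrate(probs, A, b)
print("cross-entropy before:", cross_entropy(softmax(probs), labels))
print("cross-entropy after: ", cross_entropy(calibrated, labels))
print("mean entropy after:  ", shannon_entropy(calibrated).mean())
```

Only the small matrix `A` and bias `b` are trained here; the language model itself is untouched, which is why a few labeled samples can be enough in this kind of setup.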

Code Implementations: 1 repo