LG, Oct 5, 2023

Fine-tune Language Models to Approximate Unbiased In-context Learning

arXiv:2310.03331v114.917 citations, h-index: 8

Originality: Incremental advance

AI Analysis

The paper addresses the performance degradation that biased prompts cause in in-context learning (ICL), a problem for users of large language models, and presents an incremental method to mitigate it.

The paper tackles the degradation of in-context learning (ICL) performance in large language models caused by biased or imbalanced input prompts. It introduces two algorithms, RICL and LARICL, that fine-tune models to approximate unbiased ICL. Experiments show substantial improvement over baselines such as casual prompt-based ICL and classic fine-tuning.

In-context learning (ICL) is an astonishing emergent ability of large language models (LLMs). By presenting a prompt that includes multiple input-output pairs as examples and introducing a new query input, models can generate the corresponding output. However, the performance of models heavily relies on the quality of the input prompt when implementing in-context learning. Biased or imbalanced input prompts can significantly degrade the performance of language models. To address this issue, we introduce a reweighted algorithm called RICL (Reweighted In-context Learning). This algorithm fine-tunes language models using an unbiased validation set to determine the optimal weight for each input-output example to approximate unbiased in-context learning. Furthermore, we also introduce a low-cost reweighted algorithm, a linear optimal weight approximation algorithm called LARICL (Linear Approximation of Reweighted In-context Learning). This algorithm requires minimal training cost while providing effective results. We prove the convergence of our algorithm and validate its performance through experiments conducted on a numerical dataset. The experimental findings reveal a substantial improvement in comparison to benchmarks including the performance of casual prompt-based in-context learning and the performance of a classic fine-tuning method.
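The reweighting idea in the abstract can be illustrated with a small numerical sketch. This is not the paper's RICL or LARICL algorithm: it assumes a weighted ridge regression as a stand-in for the model's in-context prediction rule, and it learns one non-negative weight per prompt example by gradient descent on the loss over a clean validation set. All names (`fit_weighted`, `val_loss`, `learn_example_weights`) and all data are hypothetical.

```python
import numpy as np

def fit_weighted(X, y, w, lam=1e-3):
    """Weighted ridge regression (assumed stand-in for the in-context predictor)."""
    A = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
    theta = np.linalg.solve(A, X.T @ (w * y))
    return theta, A

def val_loss(theta, X_val, y_val):
    """Mean squared error on the unbiased validation set."""
    return float(np.mean((X_val @ theta - y_val) ** 2))

def learn_example_weights(X, y, X_val, y_val, steps=200, lr=1.0, lam=1e-3):
    """Projected gradient descent on per-example weights.

    A step is accepted only if it lowers the validation loss;
    otherwise the step size is halved.
    """
    w = np.ones(len(y))
    theta, A = fit_weighted(X, y, w, lam)
    loss = val_loss(theta, X_val, y_val)
    for _ in range(steps):
        # Gradient of the validation loss with respect to theta.
        g_theta = 2.0 / len(y_val) * X_val.T @ (X_val @ theta - y_val)
        # Chain rule: d theta / d w_i = A^{-1} x_i (y_i - x_i^T theta).
        residual = y - X @ theta
        grad_w = residual * (X @ np.linalg.solve(A, g_theta))
        w_new = np.clip(w - lr * grad_w, 0.0, None)  # keep weights non-negative
        theta_new, A_new = fit_weighted(X, y, w_new, lam)
        loss_new = val_loss(theta_new, X_val, y_val)
        if loss_new < loss:
            w, theta, A, loss = w_new, theta_new, A_new, loss_new
        else:
            lr *= 0.5
    return w, loss

# Synthetic data: the first 5 of 20 prompt examples carry biased labels.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(20, 3))
y = X @ true_theta + 0.01 * rng.normal(size=20)
y[:5] += 5.0
X_val = rng.normal(size=(50, 3))
y_val = X_val @ true_theta + 0.01 * rng.normal(size=50)

theta_uniform, _ = fit_weighted(X, y, np.ones(20))
uniform_loss = val_loss(theta_uniform, X_val, y_val)
w, final_loss = learn_example_weights(X, y, X_val, y_val)
print("uniform-weight validation loss:", uniform_loss)
print("reweighted validation loss:", final_loss)
print("learned weights:", np.round(w, 2))
```

Because steps are accepted only when the validation loss drops, the reweighted loss in this sketch can never exceed the uniform-weight loss. How much it improves depends on the data and step size.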
