CL · May 17, 2024

Feature-Adaptive and Data-Scalable In-Context Learning

arXiv:2405.10738v2 · 29 citations · h-index: 8 · Has Code · ACL
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving in-context learning for downstream tasks in large language models, offering a method that is data-scalable and feature-adaptive, though it appears incremental as it builds on existing ICL paradigms.

The paper tackles two limitations of in-context learning (ICL) in large language models: context length constraints, which keep ICL from benefiting from additional training data, and general-purpose LLM features that are not adapted to the downstream task. It proposes FADS-ICL, a framework that uses task-adaptive features and beyond-context samples to improve inference. The reported gains are substantial: with a 1.5B model and 32 shots, FADS-ICL achieves +14.3 average accuracy over vanilla ICL and +6.2 over the previous state-of-the-art method on 10 datasets.

In-context learning (ICL), which promotes inference with several demonstrations, has become a widespread paradigm to stimulate LLM capabilities for downstream tasks. Because of context length constraints, ICL cannot be further improved even when more training data is available, and the general features taken directly from LLMs in ICL are not adapted to the specific downstream task. In this paper, we propose a feature-adaptive and data-scalable in-context learning framework (FADS-ICL), which leverages task-adaptive features to promote inference on the downstream task, under the supervision of beyond-context samples. Specifically, it first extracts general features of beyond-context samples one by one via the LLM, using the ICL input form, and then introduces a task-specific modulator that, after fitting the specific downstream task, performs feature refinement and prediction. We conduct extensive experiments on FADS-ICL under varying data settings (4 to 128 shots) and LLM scales (0.8B to 70B). Experimental results show that FADS-ICL consistently outperforms previous state-of-the-art methods by a significant margin under all settings, verifying the effectiveness and superiority of FADS-ICL. For example, under the 1.5B and 32-shot setting, FADS-ICL achieves +14.3 average accuracy from feature adaptation over vanilla ICL on 10 datasets, and +6.2 average accuracy over the previous state-of-the-art method; performance improves further with increasing training data. Code and data are publicly available at https://github.com/jiahaozhenbang/FADS-ICL.
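The two stages named in the abstract (feature extraction in ICL input form, then a fitted task-specific modulator) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which LLM features are used or how the modulator is built, so this sketch assumes a linear softmax classifier as the modulator and replaces the LLM with a toy word-count function. All names here (`VOCAB`, `build_icl_prompt`, `extract_general_features`, `fit_modulator`, `predict`) and the sentiment data are hypothetical.

```python
import numpy as np

# Toy feature basis. ASSUMPTION: stands in for the LLM's feature space.
VOCAB = ["great", "good", "loved", "bad", "awful", "boring"]

def build_icl_prompt(demos, text):
    """Put one sample into ICL input form: fixed demonstrations, then the sample."""
    blocks = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    blocks.append(f"Review: {text}\nSentiment:")
    return "\n\n".join(blocks)

def extract_general_features(prompt):
    """Hypothetical stand-in for the LLM: word counts over VOCAB.
    A real pipeline would run the LLM on the prompt and return features from it."""
    words = prompt.lower().split()
    return np.array([words.count(w) for w in VOCAB], dtype=float)

def fit_modulator(X, y, n_classes, lr=0.2, steps=3000):
    """Fit a linear softmax classifier (the assumed modulator) by gradient descent."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                          # one-hot targets
    for _ in range(steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(X)                          # cross-entropy gradient w.r.t. logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def predict(W, b, X):
    """Return the highest-scoring class index for each feature row."""
    return (X @ W + b).argmax(axis=1)

# In-context demonstrations, shared by every prompt.
demos = [("a great film", "positive"), ("a boring film", "negative")]

# Beyond-context samples: labeled data that never enters the context window.
train = [
    ("good acting and great music", 1),
    ("loved every minute", 1),
    ("good story", 1),
    ("bad plot and awful pacing", 0),
    ("boring and bad", 0),
    ("awful script", 0),
]

# Stage 1: extract features one sample at a time, each in ICL input form.
X_train = np.stack([extract_general_features(build_icl_prompt(demos, t)) for t, _ in train])
y_train = np.array([label for _, label in train])

# Stage 2: fit the modulator on those features, then predict for new queries.
W, b = fit_modulator(X_train, y_train, n_classes=2)
queries = ["good cast", "awful dialogue"]
X_query = np.stack([extract_general_features(build_icl_prompt(demos, q)) for q in queries])
preds = predict(W, b, X_query)
print(preds)
```

Because each beyond-context sample is processed in its own prompt, the number of supervised samples is not bounded by the context window. That is the data-scalable property the abstract describes. Only the small modulator is trained in this sketch, and the feature extractor stays fixed.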

Code Implementations: 1 repo