CL AI LG | Jan 9, 2025

Unlocking In-Context Learning for Natural Datasets Beyond Language Modelling

Jelena Bratulić, Sudhanshu Mittal, David T. Hoffmann, Samuel Böhm, Robin Tibor Schirrmeister, Tonio Ball, Christian Rupprecht, Thomas Brox

arXiv:2501.06256v38.34 citations | h-index: 17 | DAGM GCPR

Originality: Incremental advance

AI Analysis

This work addresses the challenge of extending ICL to non-text domains. It is an incremental advance: it builds on known properties of LLMs to improve adaptation for specific tasks such as EEG classification.

The paper tackles the problem of enabling In-Context Learning (ICL) for modalities beyond text, such as visual and EEG datasets. It identifies factors that support the emergence of ICL in autoregressive models, notably exact token repetitions in training sequences and training task difficulty, and applies them to unlock ICL capabilities for these datasets.

Large Language Models (LLMs) exhibit In-Context Learning (ICL), which enables the model to perform new tasks conditioning only on the examples provided in the context without updating the model's weights. While ICL offers fast adaptation across natural language tasks and domains, its emergence is less straightforward for modalities beyond text. In this work, we systematically uncover properties present in LLMs that support the emergence of ICL for autoregressive models and various modalities by promoting the learning of the needed mechanisms for ICL. We identify exact token repetitions in the training data sequences as an important factor for ICL. Such repetitions further improve stability and reduce transiency in ICL performance. Moreover, we emphasise the significance of training task difficulty for the emergence of ICL. Finally, by applying our novel insights on ICL emergence, we unlock ICL capabilities for various visual datasets and a more challenging EEG classification task.
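To make the idea of "exact token repetitions in the training data sequences" concrete, the following is a minimal sketch of how a few-shot training sequence with repeated tokens might be assembled. It is an illustration under stated assumptions, not the authors' data pipeline: the function name `build_icl_sequence`, the `pool` structure, and the parameters `num_context` and `num_repeats` are all hypothetical, and exemplars are represented by placeholder string ids rather than image or EEG tokens.

```python
import random


def build_icl_sequence(pool, num_context, num_repeats, rng):
    """Assemble one few-shot sequence: context (token, label) pairs plus a query.

    Hypothetical sketch. `pool` maps each label to a list of exemplar ids.
    Exactly `num_repeats` context slots hold an exact copy of the query
    token; the remaining slots are filled with other exemplars.
    """
    if not 0 <= num_repeats <= num_context:
        raise ValueError("num_repeats must lie between 0 and num_context")

    # Pick the query exemplar and its label.
    query_label = rng.choice(sorted(pool))
    query_token = rng.choice(pool[query_label])
    query = (query_token, query_label)

    # Every exemplar other than the query is a candidate filler.
    fillers = [
        (token, label)
        for label in sorted(pool)
        for token in pool[label]
        if token != query_token
    ]
    num_fillers = num_context - num_repeats
    if num_fillers > 0 and not fillers:
        raise ValueError("pool has no exemplar other than the query")

    # Exact repetitions of the query, then randomly drawn fillers.
    context = [query] * num_repeats
    context += [rng.choice(fillers) for _ in range(num_fillers)]
    rng.shuffle(context)  # avoid a fixed position for the repeated token
    return context, query


if __name__ == "__main__":
    rng = random.Random(0)
    pool = {"cat": ["c0", "c1", "c2"], "dog": ["d0", "d1", "d2"]}
    context, query = build_icl_sequence(pool, num_context=8, num_repeats=2, rng=rng)
    print(context, query)
```

Setting `num_repeats` to 0 yields a sequence in which the query never appears in the context, so varying this one parameter is a simple way to compare training with and without repetitions in this toy setup.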
