LG · Jan 16, 2025

Task Vectors in In-Context Learning: Emergence, Formation, and Benefit

Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak

arXiv:2501.09240v1 · 26.422 citations · h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses the opaque encoding of task information in transformers and offers researchers a controlled method to enhance in-context learning. It is an incremental advance that builds on prior findings.

The paper investigates the emergence and formation of task vectors in in-context learning, finding they can be weak or non-local, and proposes a TVP-loss method to encode them strongly at prescribed locations, improving robustness and generalization.

In-context learning is a remarkable capability of transformers, referring to their ability to adapt to specific tasks based on a short history or context. Previous research has found that task-specific information is locally encoded within models, though the emergence and functionality of these encodings remain unclear due to opaque pre-training processes. In this work, we investigate the formation of task vectors in a controlled setting, using models trained from scratch on synthetic datasets. Our findings confirm that task vectors naturally emerge under certain conditions, but the tasks may be relatively weakly and/or non-locally encoded within the model. To promote strong task vectors encoded at a prescribed location within the model, we propose an auxiliary training mechanism based on a task vector prompting loss (TVP-loss). This method eliminates the need to search for task-correlated encodings within the trained model and demonstrably improves robustness and generalization.
