CL, AI, LG | May 22, 2024

DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning

arXiv:2405.14899v2 | 7 citations | h-index: 39 | NIPS
AI Analysis

This paper addresses the need for interpretability in ICL, a key paradigm for flexible AI. The contribution is incremental, however, because it builds on existing influence function methods.

The paper tackles the problem of interpreting in-context learning (ICL) in transformers. It proposes DETAIL, an influence function-based attribution technique for task demonstrations, which the authors empirically verify as effective and computationally efficient. The attribution scores are then used to improve model performance through demonstration reordering and curation, and scores obtained on white-box models are shown to transfer to black-box models.

In-context learning (ICL) allows transformer-based language models that are pre-trained on general text to quickly learn a specific task with a few "task demonstrations" without updating their parameters, significantly boosting their flexibility and generality. ICL possesses many distinct characteristics from conventional machine learning, thereby requiring new approaches to interpret this learning paradigm. Taking the viewpoint of recent works showing that transformers learn in context by formulating an internal optimizer, we propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL. We empirically verify the effectiveness of our approach for demonstration attribution while being computationally efficient. Leveraging the results, we then show how DETAIL can help improve model performance in real-world scenarios through demonstration reordering and curation. Finally, we experimentally prove the wide applicability of DETAIL by showing our attribution scores obtained on white-box models are transferable to black-box models in improving model performance.
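To make the idea of influence function-based attribution concrete, here is a minimal sketch of the classical influence function applied to a ridge-regression surrogate. It is an illustration under stated assumptions, not the DETAIL implementation: the feature matrix `X` stands in for whatever per-demonstration representation one would extract from a model, and the names `influence_scores`, `rank_demonstrations`, and `lam` are hypothetical. A negative score means that upweighting the demonstration would lower the test loss, so it is treated as helpful.

```python
import numpy as np


def influence_scores(X, y, x_test, y_test, lam=1.0):
    """Classical influence of each demonstration on one test loss.

    Assumes a ridge-regression surrogate with objective
        (1/n) * sum_i 0.5 * (x_i @ theta - y_i)**2 + 0.5 * lam * ||theta||^2.
    Returns an array of shape (n,), where entry i approximates the change in
    test loss per unit of extra weight placed on demonstration i.
    """
    n, d = X.shape
    hessian = X.T @ X / n + lam * np.eye(d)        # Hessian of the objective
    theta = np.linalg.solve(hessian, X.T @ y / n)  # closed-form minimizer
    # Per-demonstration gradients of the squared loss, shape (n, d).
    grad_train = (X @ theta - y)[:, None] * X
    # Gradient of the test loss, shape (d,).
    grad_test = (x_test @ theta - y_test) * x_test
    # Influence: -grad_test^T H^{-1} grad_train_i for every i.
    return -grad_train @ np.linalg.solve(hessian, grad_test)


def rank_demonstrations(scores):
    """Indices ordered from most helpful (most negative score) to most harmful."""
    return np.argsort(scores)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))   # 6 hypothetical demonstrations, 4 features each
    y = rng.normal(size=6)        # hypothetical numeric labels
    x_test = rng.normal(size=4)
    scores = influence_scores(X, y, x_test, y_test=0.5, lam=0.1)
    order = rank_demonstrations(scores)
    print("scores:", scores)
    print("most helpful first:", order)
```

A ranking like the one returned by `rank_demonstrations` is the kind of signal that could drive curation (dropping the demonstrations ranked as most harmful) or reordering, though how the scores are best used in practice is something the paper's experiments address, not this sketch.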

Code Implementations: 1 repo
