When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models
The work provides a practical method for improving ICL performance in NLP applications, though the contribution is incremental, since it builds on existing model analysis techniques.
The paper aims to improve in-context learning (ICL) in large language models by analyzing individual components, namely attention heads and MLPs, and finds that some components outperform the full model. It proposes component reweighting, which improves accuracy by an average of 6.0 percentage points over 24-shot ICL across 8 tasks on Llama-2-7B.
This paper studies in-context learning by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that always predict the same label. We find that component accuracies are well-correlated across different demonstration sets and perturbations of prompt templates. Based on our findings, we propose component reweighting, which learns to linearly re-scale the component activations from a few labeled examples. Given 24 labeled examples, our method improves by an average of 6.0% accuracy points over 24-shot ICL across 8 tasks on Llama-2-7B. Overall, this paper both enriches our understanding of ICL and provides a practical method for improvement by examining model internals.
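The two ideas in the abstract, scoring each component on its own and then learning a linear re-scaling from a few labeled examples, can be illustrated with a minimal sketch. It assumes the per-component contributions to the label logits have already been extracted into an array `G` of shape `(N, C, K)` (examples, components, labels) and that the full model's logits are approximately their sum. The function names, the one-scalar-per-component parameterization, and the plain gradient-descent loop are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def component_accuracies(G, y):
    """Accuracy of each component used alone as a classifier.

    G: (N, C, K) hypothetical per-component label logits.
    y: (N,) integer gold labels.
    """
    preds = G.argmax(axis=2)                    # (N, C) label picked by each component
    return (preds == y[:, None]).mean(axis=0)   # (C,)


def reweighted_accuracy(G, y, w=None):
    """Accuracy of the weighted sum of components (w = 1 gives the plain sum)."""
    w = np.ones(G.shape[1]) if w is None else w
    logits = np.einsum("c,nck->nk", w, G)
    return float((logits.argmax(axis=1) == y).mean())


def fit_component_weights(G, y, lr=0.1, steps=200):
    """Learn one scalar weight per component with cross-entropy loss.

    Weights start at 1, so training begins from the unweighted sum.
    """
    n, c, k = G.shape
    w = np.ones(c)
    onehot = np.eye(k)[y]
    for _ in range(steps):
        logits = np.einsum("c,nck->nk", w, G)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        # d(loss)/d(w_c) = mean_n sum_k (p - onehot)[n, k] * G[n, c, k]
        grad = np.einsum("nk,nck->c", p - onehot, G) / n
        w -= lr * grad
    return w
```

In this sketch, `component_accuracies` is what would surface the good-performing, bad-performing, and label-biased components the abstract describes, while `fit_component_weights` would be fit on the small labeled set (24 examples in the paper's setting) and then applied to held-out inputs through `reweighted_accuracy`.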