CL | AI | Sep 30, 2024

Do Influence Functions Work on Large Language Models?

arXiv:2409.19998v2 | 10.821 citations | h-index: 12 | Has Code

Originality: Incremental advance

AI Analysis

This work highlights a critical limitation for researchers and practitioners attempting to use influence functions for data attribution and understanding in LLMs, suggesting the need for new methodologies.

This paper investigates the effectiveness of influence functions when applied to large language models (LLMs) across various tasks. The study concludes that influence functions consistently perform poorly in most LLM settings, attributing this to approximation errors, uncertain fine-tuning convergence, and a fundamental mismatch in how parameter changes relate to LLM behavior.

Influence functions are important for quantifying the impact of individual training data points on a model's predictions. Although extensive research has been conducted on influence functions in traditional machine learning models, their application to large language models (LLMs) has been limited. In this work, we conduct a systematic study to address a key question: do influence functions work on LLMs? Specifically, we evaluate influence functions across multiple tasks and find that they consistently perform poorly in most settings. Our further investigation reveals that their poor performance can be attributed to: (1) inevitable approximation errors when estimating the iHVP component due to the scale of LLMs, (2) uncertain convergence during fine-tuning, and, more fundamentally, (3) the definition itself, as changes in model parameters do not necessarily correlate with changes in LLM behavior. Thus, our study suggests the need for alternative approaches for identifying influential samples.
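To make the quantities in the abstract concrete, here is a minimal sketch of the classical influence-function score on a toy regularized linear regression, where the Hessian H is small enough to form and solve exactly. It is not the paper's code: the function name `influence_scores`, the `damping` parameter, and the choice of model are assumptions made for illustration. The score used is the standard form, influence(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i), and the vector H^{-1} grad L(z_test) is the iHVP that the abstract refers to.

```python
import numpy as np


def influence_scores(X_train, y_train, x_test, y_test, damping=1e-3):
    """Score each training point's influence on one test loss.

    Illustrative toy setting (an assumption, not the paper's setup):
    objective = mean_i 0.5 * (x_i @ theta - y_i)**2 + 0.5 * damping * ||theta||^2
    """
    n, d = X_train.shape

    # Exact Hessian of the objective; constant in theta for a quadratic loss.
    H = X_train.T @ X_train / n + damping * np.eye(d)

    # Exact minimizer, so the "converged model" assumption holds here.
    theta = np.linalg.solve(H, X_train.T @ y_train / n)

    # Per-example loss gradients at the optimum, shape (n, d).
    residuals = X_train @ theta - y_train
    grads_train = residuals[:, None] * X_train

    # Gradient of the test loss, shape (d,).
    grad_test = (x_test @ theta - y_test) * x_test

    # Inverse-Hessian-vector product (iHVP), solved exactly.
    ihvp = np.linalg.solve(H, grad_test)

    # One score per training point: -grad_test^T H^{-1} grad_i.
    return -grads_train @ ihvp


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=50)
scores = influence_scores(X, y, X[0], y[0])
most_influential = np.argsort(-np.abs(scores))[:5]  # indices of the 5 largest |score|
```

In this toy case both the minimizer and the iHVP are exact. At LLM scale neither H nor its inverse can be formed, so the iHVP has to be approximated, and fine-tuning may not reach a stationary point; these correspond to the first two failure sources the abstract lists.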
