Towards Budget-Friendly Model-Agnostic Explanation Generation for Large Language Models
This work addresses the high cost of model-agnostic explanation generation for LLMs, offering a cost-effective option for users who need interpretability. The contribution is incremental, since it builds on existing techniques.
The paper tackles the high economic cost of generating faithful explanations for large language models (LLMs) by drawing the required samples from budget-friendly models. The resulting proxy explanations are shown to be practical and to perform well on downstream tasks.
As large language models (LLMs) become increasingly prevalent across applications, interpreting their predictions has become a critical challenge. Because LLMs vary in architecture and some are closed-source, model-agnostic techniques are promising: they do not require access to the model's internal parameters. However, existing model-agnostic techniques must invoke the LLM many times to gather enough samples for a faithful explanation, which leads to high economic costs. In this paper, we show through a series of empirical studies that it is practical to generate faithful explanations for large-scale LLMs by sampling from budget-friendly models instead. Moreover, we show that such proxy explanations also perform well on downstream tasks. Our analysis provides a new paradigm for model-agnostic explanation of LLMs, one that incorporates information from budget-friendly models.
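To make the idea concrete, the following is a minimal sketch of a perturbation-based, LIME-style explanation in which the samples are scored by a budget-friendly model rather than by the expensive target. It is an illustration under stated assumptions, not the procedure used in the paper: the two scorers (`target_score`, `proxy_score`), their weights, the proximity kernel, and the top-k overlap check are all hypothetical, and the scorers are toy linear functions so that the surrogate fit is exact.

```python
import numpy as np

def lime_style_attributions(predict, n_tokens, n_samples=200, seed=0):
    """Fit a weighted linear surrogate on random token-keep masks.

    `predict` maps a 0/1 mask (1 = token kept) to a scalar score.
    Returns one attribution weight per token.
    """
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_samples, n_tokens))
    masks[0] = 1  # always include the unperturbed input
    y = np.array([predict(m) for m in masks], dtype=float)

    # Proximity kernel: masks that keep more tokens get larger weight.
    dist = 1.0 - masks.mean(axis=1)
    sample_weight = np.exp(-(dist ** 2) / 0.25)

    # Weighted least squares with an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    sw = np.sqrt(sample_weight)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[:-1]  # drop the intercept

# Hypothetical stand-ins for the two models (assumptions, not real LLM calls).
W_TARGET = np.array([2.0, -1.5, 0.1, 0.0, 1.0])  # expensive model
W_PROXY = np.array([1.8, -1.2, 0.0, 0.1, 0.9])   # budget-friendly model

def target_score(mask):
    return float(mask @ W_TARGET)

def proxy_score(mask):
    return float(mask @ W_PROXY)

def top_k(attributions, k):
    """Indices of the k tokens with the largest absolute attribution."""
    return set(np.argsort(-np.abs(attributions))[:k].tolist())

# The proxy explanation needs no calls to the expensive model.
attr_proxy = lime_style_attributions(proxy_score, n_tokens=5)
# Reference explanation from the target, computed here only to compare.
attr_target = lime_style_attributions(target_score, n_tokens=5)

k = 2
overlap = len(top_k(attr_proxy, k) & top_k(attr_target, k)) / k
print("proxy attributions: ", np.round(attr_proxy, 3))
print("target attributions:", np.round(attr_target, 3))
print("top-k overlap:", overlap)
```

In this toy setting all sampling cost falls on the proxy scorer. Whether a proxy explanation remains faithful to a real target LLM is the empirical question the paper studies, and the overlap measure here is only one possible way to check it.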