BELLA: Black box model Explanations by Local Linear Approximations
BELLA addresses the need for better interpretability in regression models, particularly for legal compliance and performance assessment. The contribution appears incremental, however, because it builds on existing post-hoc explanation methods.
The paper tackles the problem of unreliable and narrow post-hoc explanations for regression black-box models. It introduces BELLA, a deterministic, model-agnostic approach that explains individual predictions with a linear model and maximizes the size of the neighborhood to which that model applies, so that the explanations are accurate, simple, general, and robust.
Understanding the decision-making process of black-box models has become not just a legal requirement, but also an additional way to assess their performance. However, the state-of-the-art post-hoc explanation approaches for regression models rely on synthetic data generation, which introduces uncertainty and can hurt the reliability of the explanations. Furthermore, they tend to produce explanations that apply to only very few data points. In this paper, we present BELLA, a deterministic model-agnostic post-hoc approach for explaining the individual predictions of regression black-box models. BELLA provides explanations in the form of a linear model trained in the feature space. BELLA maximizes the size of the neighborhood to which the linear model applies so that the explanations are accurate, simple, general, and robust.
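
The idea of a local linear explanation fitted on real data points, over a neighborhood that is as large as possible, can be illustrated with a minimal sketch. This is not BELLA's actual algorithm, which the abstract does not specify. The function name `local_linear_explanation`, the Euclidean distance, and the RMSE threshold `max_rmse` used to decide how far the neighborhood may grow are all assumptions made for illustration. The sketch uses no random sampling or synthetic data, so repeated runs give the same result.

```python
import numpy as np

def local_linear_explanation(X, y_pred, x0, max_rmse=0.1, min_size=10):
    """Hypothetical sketch: fit a linear surrogate around x0 on the largest
    nearest-neighbor set whose fit error stays within max_rmse.

    X      : (n, d) array of real data points (no synthetic samples)
    y_pred : (n,) black-box predictions for the rows of X
    x0     : (d,) point whose prediction is being explained
    Returns (coef, k, rmse) with coef = [intercept, w_1, ..., w_d],
    or None if no neighborhood of at least min_size points qualifies.
    """
    # Rank the real data points by Euclidean distance to x0 (an assumed metric).
    order = np.argsort(np.linalg.norm(X - x0, axis=1))
    best = None
    for k in range(min_size, len(X) + 1):
        idx = order[:k]
        # Design matrix with an intercept column followed by the features.
        A = np.column_stack([np.ones(k), X[idx]])
        coef, *_ = np.linalg.lstsq(A, y_pred[idx], rcond=None)
        rmse = float(np.sqrt(np.mean((A @ coef - y_pred[idx]) ** 2)))
        if rmse <= max_rmse:
            # Later iterations overwrite earlier ones, so the largest
            # acceptable neighborhood is the one returned.
            best = (coef, k, rmse)
    return best
```

The returned coefficients act as the explanation: each weight states how the black-box prediction changes per unit of the corresponding feature within the selected neighborhood, and `k` reports how many real data points the explanation covers. Refitting for every candidate size is quadratic in the number of points and is only meant to keep the sketch short.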