Understanding surrogate explanations: the interplay between complexity, fidelity and coverage
This work addresses interpretability in machine learning for users who need transparent AI decisions; its contribution is incremental, as it builds on existing surrogate explanation methods.
The paper analyzes the trade-offs between complexity, fidelity, and coverage in surrogate explanations for black-box models. It shows that local surrogates reach a more favourable fidelity-complexity Pareto frontier than global ones, and it presents experiments in which making the local surrogate procedure interactive leads to better explanations.
This paper analyses the fundamental ingredients behind surrogate explanations to provide a better understanding of their inner workings. We start our exposition by considering global surrogates, describing the trade-off between the complexity of the surrogate and its fidelity to the black box being modelled. We show that transitioning from global to local surrogates, that is, reducing coverage, allows for more favourable conditions on the fidelity-complexity Pareto frontier of a surrogate. We discuss the interplay between complexity, fidelity and coverage, and consider how different user needs can lead to problem formulations where these are either constraints or penalties. We also present experiments that demonstrate how the local surrogate interpretability procedure can be made interactive and lead to better explanations.
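The interplay between complexity, fidelity and coverage can be illustrated with a toy example. The sketch below is not the paper's method or experimental setup. It assumes a hypothetical one-dimensional black box (`black_box`), uses a piecewise-constant surrogate whose number of bins stands in for complexity, takes the width of the fitted interval as coverage, and reports mean squared error against the black box as a measure of infidelity. All function names and the chosen regions are illustrative assumptions.

```python
import math


def black_box(x):
    # Hypothetical black box: a smooth nonlinear function standing in
    # for an opaque model (an assumption, not taken from the paper).
    return math.sin(3.0 * x) + 0.5 * x


def fit_piecewise_constant(xs, ys, lo, hi, n_bins):
    """Fit a surrogate made of n_bins equal-width constant pieces on [lo, hi]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x, y in zip(xs, ys):
        b = min(int((x - lo) / width), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # Each piece predicts the mean black-box output of the samples in its bin.
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]

    def surrogate(x):
        b = min(max(int((x - lo) / width), 0), n_bins - 1)
        return means[b]

    return surrogate


def infidelity(lo, hi, n_bins, n_samples=2000):
    """Mean squared error between surrogate and black box on an even grid over [lo, hi].

    The error is measured on the same grid used for fitting, which is enough
    for this illustration but is not a held-out estimate.
    """
    xs = [lo + (hi - lo) * (i + 0.5) / n_samples for i in range(n_samples)]
    ys = [black_box(x) for x in xs]
    g = fit_piecewise_constant(xs, ys, lo, hi, n_bins)
    return sum((g(x) - y) ** 2 for x, y in zip(xs, ys)) / n_samples


# Coverage: a wide (global) region versus a narrow (local) one. Both are assumptions.
global_region = (-3.0, 3.0)
local_region = (-0.5, 0.5)

# results[n_bins] = (global infidelity, local infidelity)
results = {}
for n_bins in (2, 4, 8):
    results[n_bins] = (
        infidelity(global_region[0], global_region[1], n_bins),
        infidelity(local_region[0], local_region[1], n_bins),
    )
    print(f"bins={n_bins}  global MSE={results[n_bins][0]:.4f}  "
          f"local MSE={results[n_bins][1]:.4f}")
```

In this toy setting, the local surrogate has lower error than the global one at every complexity level, and error falls as the number of bins grows. A user with a hard limit on complexity would treat `n_bins` as a constraint and minimise error, whereas a user who tolerates some complexity would instead minimise error plus a penalty on `n_bins`.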