Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis
The proposed Sobol-based attribution method explains black-box models efficiently in vision and language tasks, with faster computation and favorable accuracy compared to existing methods.
The paper tackles the problem of efficiently explaining black-box neural network predictions. It proposes a novel attribution method based on Sobol indices from sensitivity analysis, which capture both the individual contributions of image regions and their higher-order interactions through variance. The method achieves favorable scores on standard benchmarks and drastically reduces computing time compared to other black-box methods, even surpassing the accuracy of state-of-the-art white-box methods.
We describe a novel attribution method which is grounded in Sensitivity Analysis and uses Sobol indices. Beyond modeling the individual contributions of image regions, Sobol indices provide an efficient way to capture higher-order interactions between image regions and their contributions to a neural network's prediction through the lens of variance. We describe an approach that makes the computation of these indices tractable for high-dimensional inputs such as images by coupling perturbation masks with efficient estimators. Importantly, we show that the proposed method leads to favorable scores on standard benchmarks for vision and language models while drastically reducing the computing time compared to other black-box methods, and that it even surpasses the accuracy of state-of-the-art white-box methods, which require access to internal representations. Our code is freely available: https://github.com/fel-thomas/Sobol-Attribution-Method
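To make the idea of variance-based attribution concrete, the following is a minimal sketch of how total-order Sobol indices can be estimated from model evaluations alone. It is an illustration under stated assumptions, not the released implementation. The abstract does not name a specific estimator or sampling scheme, so the Jansen estimator and plain uniform sampling are assumed here, and `toy_model` and `total_sobol_indices` are hypothetical names. In the image setting, each input dimension would correspond to one cell of a low-resolution perturbation mask and the black box would be the network's score on the masked image.

```python
import numpy as np


def total_sobol_indices(f, d, n, rng):
    """Estimate total-order Sobol indices of f: [0, 1]^d -> R.

    Assumption: Jansen estimator with two independent uniform sample
    matrices. The source only says "efficient estimators".
    """
    A = rng.random((n, d))  # first set of n random masks
    B = rng.random((n, d))  # independent second set
    fA = f(A)
    fB = f(B)
    # Output variance, estimated from all unperturbed evaluations.
    var = np.var(np.concatenate([fA, fB]))
    st = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]  # resample only dimension i
        # Jansen: E[(f(A) - f(AB_i))^2] / (2 Var(Y))
        st[i] = np.mean((fA - f(AB)) ** 2) / (2.0 * var)
    return st


def toy_model(masks):
    """Hypothetical black box standing in for a network score.

    Regions 0 and 1 interact, region 2 acts alone, region 3 is ignored.
    """
    return masks[:, 0] * masks[:, 1] + 0.5 * masks[:, 2]


st = total_sobol_indices(toy_model, d=4, n=4096, rng=np.random.default_rng(0))
print(st)
```

For this toy function the exact total indices are 0.4, 0.4, 0.3 and 0, and the estimates should land close to those values. The first-order index of region 0 is only 0.3, so the gap to its total index of 0.4 is the share of variance due to its interaction with region 1. This is the kind of higher-order effect that a purely individual attribution would miss. The estimator needs n * (d + 2) evaluations of the black box, which is why keeping the mask resolution low matters for images.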