Efficient Contrastive Explanations on Demand
This work addresses the efficiency of generating symbolic explanations for complex models, which matters for improving interpretability in AI applications. The contribution is incremental, since it builds on existing connections between adversarial robustness and explanations.
The paper tackles the performance bottleneck in computing contrastive explanations for machine learning models with many features. It proposes novel algorithms that leverage adversarial robustness, and its experiments demonstrate the resulting performance gains.
Recent work revealed a tight connection between adversarial robustness and restricted forms of symbolic explanations, namely distance-based (formal) explanations. This connection is significant because it represents a first step towards making the computation of symbolic explanations as efficient as deciding the existence of adversarial examples, especially for highly complex machine learning (ML) models. However, a major performance bottleneck remains because of the very large number of features that ML models may possess, in particular deep neural networks. This paper proposes novel algorithms to compute so-called contrastive explanations for ML models with a large number of features by leveraging adversarial robustness. Furthermore, the paper proposes novel algorithms for listing explanations and for finding smallest contrastive explanations. The experimental results demonstrate the performance gains achieved by the proposed algorithms.
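To illustrate how a robustness oracle can drive the computation of a contrastive explanation, the Python sketch below uses a plain deletion-based linear search. It is not one of the algorithms proposed in the paper. The toy Boolean classifier `predict`, the brute-force oracle `has_adversarial_example`, the Hamming-distance bound `eps`, and every other name are assumptions made for this example. A contrastive explanation is taken here to be a subset-minimal set of features that, when allowed to change while all other features stay fixed, admits an adversarial example.

```python
from itertools import product


def predict(x):
    """Hypothetical toy classifier over three Boolean features."""
    return int((x[0] and x[1]) or x[2])


def has_adversarial_example(predict, v, free, eps=None):
    """Brute-force stand-in for a robustness oracle (assumption).

    Returns True if some point that agrees with v outside `free`, and lies
    within Hamming distance `eps` of v when `eps` is given, gets a
    prediction different from that of v.
    """
    target = predict(v)
    free = sorted(free)
    for values in product([0, 1], repeat=len(free)):
        x = list(v)
        for i, val in zip(free, values):
            x[i] = val
        dist = sum(a != b for a, b in zip(x, v))
        if eps is not None and dist > eps:
            continue
        if predict(tuple(x)) != target:
            return True
    return False


def contrastive_explanation(predict, v, eps=None):
    """Deletion-based search for one subset-minimal contrastive explanation.

    Starts with every feature free and fixes features one at a time,
    keeping a feature free only if fixing it rules out all adversarial
    examples. Returns None if no adversarial example exists at all.
    """
    free = set(range(len(v)))
    if not has_adversarial_example(predict, v, free, eps):
        return None
    for i in range(len(v)):
        candidate = free - {i}
        if has_adversarial_example(predict, v, candidate, eps):
            free = candidate  # feature i is not needed to change the prediction
    return free


if __name__ == "__main__":
    print(contrastive_explanation(predict, (1, 1, 0)))
```

The search makes one oracle call per feature after the initial check, so its cost grows with the number of features. This is consistent with the bottleneck described above for models with many features.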