CVSep 25, 2023
Multiple Different Black Box Explanations for Image ClassifiersHana Chockler, David A. Kelly, Daniel Kroening
Existing explanation tools for image classifiers usually give only a single explanation for an image's classification. For many images, however, image classifiers accept more than one explanation for the image label. These explanations are useful for analyzing the decision process of the classifier and for detecting errors. Thus, restricting the number of explanations to just one severely limits insight into the behavior of the classifier. In this paper, we describe an algorithm and a tool, MultEX, for computing multiple explanations as the output of a black-box image classifier for a given image. Our algorithm uses a principled approach based on actual causality. We analyze its theoretical complexity and evaluate MultEX against the state-of-the-art across three different models and three different datasets. We find that MultEX finds more explanations and that these explanations are of higher quality.
CVNov 23, 2023
You Only Explain OnceDavid A. Kelly, Hana Chockler, Daniel Kroening et al.
In this paper, we propose a new black-box explainability algorithm and tool, YO-ReX, for efficient explanation of the outputs of object detectors. The new algorithm computes explanations for all objects detected in the image simultaneously. Hence, compared to the baseline, the new algorithm reduces the number of queries by a factor of 10X for the case of ten detected objects. The speedup increases further with with the number of objects. Our experimental results demonstrate that YO-ReX can explain the outputs of YOLO with a negligible overhead over the running time of YOLO. We also demonstrate similar results for explaining SSD and Faster R-CNN. The speedup is achieved by avoiding backtracking by combining aggressive pruning with a causal analysis.
CVNov 24, 2023
MRxaI: Black-Box Explainability for Image Classifiers in a Medical SettingNathan Blake, Hana Chockler, David A. Kelly et al.
Existing tools for explaining the output of image classifiers can be divided into white-box, which rely on access to the model internals, and black-box, agnostic to the model. As the usage of AI in the medical domain grows, so too does the usage of explainability tools. Existing work on medical image explanations focuses on white-box tools, such as gradcam. However, there are clear advantages to switching to a black-box tool, including the ability to use it with any classifier and the wide selection of black-box tools available. On standard images, black-box tools are as precise as white-box. In this paper we compare the performance of several black-box methods against gradcam on a brain cancer MRI dataset. We demonstrate that most black-box tools are not suitable for explaining medical image classifications and present a detailed analysis of the reasons for their shortcomings. We also show that one black-box tool, a causal explainability-based rex, performs as well as \gradcam.
SDJan 23
I Guess That's Why They Call it the Blues: Causal Analysis for Audio ClassifiersDavid A. Kelly, Hana Chockler
It is well-known that audio classifiers often rely on non-musically relevant features and spurious correlations to classify audio. Hence audio classifiers are easy to manipulate or confuse, resulting in wrong classifications. While inducing a misclassification is not hard, until now the set of features that the classifiers rely on was not well understood. In this paper we introduce a new method that uses causal reasoning to discover features of the frequency space that are sufficient and necessary for a given classification. We describe an implementation of this algorithm in the tool FreqReX and provide experimental results on a number of standard benchmark datasets. Our experiments show that causally sufficient and necessary subsets allow us to manipulate the outputs of the models in a variety of ways by changing the input very slightly. Namely, a change to one out of 240,000 frequencies results in a change in classification 58% of the time, and the change can be so small that it is practically inaudible. These results show that causal analysis is useful for understanding the reasoning process of audio classifiers and can be used to successfully manipulate their outputs.
25.4SDApr 3
If It's Good Enough for You, It's Good Enough for Me: Transferability of Audio Sufficiencies across ModelsDavid A. Kelly, Hana Chockler
In order to gain fresh insights about the information processing characteristics of different audio classification models, we propose transferability analysis. Given a minimal, sufficient signal for a classification on a model $f$, transferability analysis asks whether other models accept this minimal signal as having the same classification as it did on $f$. We define what it means for a sufficient signal to be transferable and perform a large study over $3$ different classification tasks: music genre, emotion recognition and deepfake detection. We find that transferability rates vary depending on the task, with sufficient signals for music genre being transferable $\approx26\%$ of the time. The other tasks reveal much higher variance in transferability and reveal that some models, in particular on deepfake detection, have different transferability behavior. We call these models `flat-earther' models. We investigate deepfake audio in more depth, and show that transferability analysis also allows to us to discover information theoretic differences between the models which are not captured by the more familiar metrics of accuracy and precision.
CVAug 21, 2024
Real-Time Incremental Explanations for Object Detectors in Autonomous DrivingSantiago Calderón-Peña, Hana Chockler, David A. Kelly
Object detectors are widely used in safety-critical real-time applications such as autonomous driving. Explainability is especially important for safety-critical applications, and due to the variety of object detectors and their often proprietary nature, black-box explainability tools are needed. However, existing black-box explainability tools for AI models rely on multiple model calls, rendering them impractical for real-time use. In this paper, we introduce IncX, an algorithm and a tool for real-time black-box explainability for object detectors. The algorithm is based on linear transformations of saliency maps, producing sufficient explanations. We evaluate our implementation on four widely used video datasets of autonomous driving and demonstrate that IncX's explanations are comparable in quality to the state-of-the-art and are computed two orders of magnitude faster than the state-of-the-art, making them usable in real time.
CVDec 3, 2025
Out-of-the-box: Black-box Causal Attacks on Object DetectorsMelane Navaratnarajah, David A. Kelly, Hana Chockler
Adversarial perturbations are a useful way to expose vulnerabilities in object detectors. Existing perturbation methods are frequently white-box and architecture specific. More importantly, while they are often successful, it is rarely clear why they work. Insights into the mechanism of this success would allow developers to understand and analyze these attacks, as well as fine-tune the model to prevent them. This paper presents BlackCAtt, a black-box algorithm and a tool, which uses minimal, causally sufficient pixel sets to construct explainable, imperceptible, reproducible, architecture-agnostic attacks on object detectors. BlackCAtt combines causal pixels with bounding boxes produced by object detectors to create adversarial attacks that lead to the loss, modification or addition of a bounding box. BlackCAtt works across different object detectors of different sizes and architectures, treating the detector as a black box. We compare the performance of BlackCAtt with other black-box attack methods and show that identification of causal pixels leads to more precisely targeted and less perceptible attacks. On the COCO test dataset, our approach is 2.7 times better than the baseline in removing a detection, 3.86 times better in changing a detection, and 5.75 times better in triggering new, spurious, detections. The attacks generated by BlackCAtt are very close to the original image, and hence imperceptible, demonstrating the power of causal pixels.
AINov 13, 2024
Causal Explanations for Image ClassifiersHana Chockler, David A. Kelly, Daniel Kroening et al.
Existing algorithms for explaining the output of image classifiers use different definitions of explanations and a variety of techniques to extract them. However, none of the existing tools use a principled approach based on formal definitions of causes and explanations for the explanation extraction. In this paper we present a novel black-box approach to computing explanations grounded in the theory of actual causality. We prove relevant theoretical results and present an algorithm for computing approximate explanations based on these definitions. We prove termination of our algorithm and discuss its complexity and the amount of approximation compared to the precise definition. We implemented the framework in a tool ReX and we present experimental results and a comparison with state-of-the-art tools. We demonstrate that \rex is the most efficient tool and produces the smallest explanations, in addition to outperforming other black-box tools on standard quality measures.
CVJul 31, 2025
I Am Big, You Are Little; I Am Right, You Are WrongDavid A. Kelly, Akchunya Chanchal, Nathan Blake
Machine learning for image classification is an active and rapidly developing field. With the proliferation of classifiers of different sizes and different architectures, the problem of choosing the right model becomes more and more important. While we can assess a model's classification accuracy statistically, our understanding of the way these models work is unfortunately limited. In order to gain insight into the decision-making process of different vision models, we propose using minimal sufficient pixels sets to gauge a model's `concentration': the pixels that capture the essence of an image through the lens of the model. By comparing position, overlap, and size of sets of pixels, we identify that different architectures have statistically different concentration, in both size and position. In particular, ConvNext and EVA models differ markedly from the others. We also identify that images which are misclassified are associated with larger pixels sets than correct classifications.
LGMar 18, 2025
SpecReX: Explainable AI for Raman SpectroscopyNathan Blake, David A. Kelly, Akchunya Chanchal et al.
Raman spectroscopy is becoming more common for medical diagnostics with deep learning models being increasingly used to leverage its full potential. However, the opaque nature of such models and the sensitivity of medical diagnosis together with regulatory requirements necessitate the need for explainable AI tools. We introduce SpecReX, specifically adapted to explaining Raman spectra. SpecReX uses the theory of actual causality to rank causal responsibility in a spectrum, quantified by iteratively refining mutated versions of the spectrum and testing if it retains the original classification. The explanations provided by SpecReX take the form of a responsibility map, highlighting spectral regions most responsible for the model to make a correct classification. To assess the validity of SpecReX, we create increasingly complex simulated spectra, in which a "ground truth" signal is seeded, to train a classifier. We then obtain SpecReX explanations and compare the results with another explainability tool. By using simulated spectra we establish that SpecReX localizes to the known differences between classes, under a number of conditions. This provides a foundation on which we can find the spectral features which differentiate disease classes. This is an important first step in proving the validity of SpecReX.
AIOct 1, 2025
Activation-Deactivation: A General Framework for Robust Post-hoc Explainable AIAkchunya Chanchal, David A. Kelly, Hana Chockler
Black-box explainability methods are popular tools for explaining the decisions of image classifiers. A major drawback of these tools is their reliance on mutants obtained by occluding parts of the input, leading to out-of-distribution images. This raises doubts about the quality of the explanations. Moreover, choosing an appropriate occlusion value often requires domain knowledge. In this paper we introduce a novel forward-pass paradigm Activation-Deactivation (AD), which removes the effects of occluded input features from the model's decision-making by switching off the parts of the model that correspond to the occlusions. We introduce ConvAD, a drop-in mechanism that can be easily added to any trained Convolutional Neural Network (CNN), and which implements the AD paradigm. This leads to more robust explanations without any additional training or fine-tuning. We prove that the ConvAD mechanism does not change the decision-making process of the network. We provide experimental evaluation across several datasets and model architectures. We compare the quality of AD-explanations with explanations achieved using a set of masking values, using the proxies of robustness, size, and confidence drop-off. We observe a consistent improvement in robustness of AD explanations (up to 62.5%) compared to explanations obtained with occlusions, demonstrating that ConvAD extracts more robust explanations without the need for domain knowledge.
AISep 12, 2025
Evaluation of Black-Box XAI Approaches for Predictors of Values of Boolean FormulaeStav Armoni-Friedmann, Hana Chockler, David A. Kelly
Evaluating explainable AI (XAI) approaches is a challenging task in general, due to the subjectivity of explanations. In this paper, we focus on tabular data and the specific use case of AI models predicting the values of Boolean functions. We extend the previous work in this domain by proposing a formal and precise measure of importance of variables based on actual causality, and we evaluate state-of-the-art XAI tools against this measure. We also present a novel XAI tool B-ReX, based on the existing tool ReX, and demonstrate that it is superior to other black-box XAI tools on a large-scale benchmark. Specifically, B-ReX achieves a Jensen-Shannon divergence of 0.072 $\pm$ 0.012 on random 10-valued Boolean formulae
IVFeb 14, 2025
3D ReX: Causal Explanations in 3D Neuroimaging ClassificationMelane Navaratnarajah, Sophie A. Martin, David A. Kelly et al.
Explainability remains a significant problem for AI models in medical imaging, making it challenging for clinicians to trust AI-driven predictions. We introduce 3D ReX, the first causality-based post-hoc explainability tool for 3D models. 3D ReX uses the theory of actual causality to generate responsibility maps which highlight the regions most crucial to the model's decision. We test 3D ReX on a stroke detection model, providing insight into the spatial distribution of features relevant to stroke.