TTE-CAM: Built-in Class Activation Maps for Test-Time Explainability in Pretrained Black-Box CNNs
For medical image analysis, TTE-CAM offers a practical way to obtain faithful explanations without sacrificing accuracy, addressing the trade-off between interpretability and predictive performance.
TTE-CAM converts pretrained black-box CNNs into self-explainable models by replacing the classification head with a convolution-based module initialized from the original weights. The converted model preserves predictive performance and provides built-in, faithful explanations that are competitive with post-hoc methods.
Convolutional neural networks (CNNs) achieve state-of-the-art performance in medical image analysis yet remain opaque, limiting adoption in high-stakes clinical settings. Existing approaches face a fundamental trade-off: post-hoc methods provide unfaithful approximate explanations, while inherently interpretable architectures are faithful but often sacrifice predictive performance. We introduce TTE-CAM, a test-time framework that bridges this gap by converting pretrained black-box CNNs into self-explainable models via a convolution-based replacement of their classification head, initialized from the original weights. The resulting model preserves black-box predictive performance while delivering built-in faithful explanations competitive with post-hoc methods, both qualitatively and quantitatively. The code is available at https://github.com/kdjoumessi/Test-Time-Explainability
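To illustrate why a convolution-based head initialized from the original weights can leave predictions unchanged, the sketch below works through one simple case. It assumes the original head is global average pooling followed by a single linear layer, and that the replacement is a 1x1 convolution reusing the linear weights and bias, followed by spatial averaging. This is an assumption for illustration, not necessarily the construction used in TTE-CAM. All function and variable names are hypothetical, and plain nested lists stand in for tensors.

```python
import random

def gap_linear_logits(features, weight, bias):
    """Assumed original head: global average pooling, then a linear layer.

    features: C x H x W nested lists; weight: K x C; bias: length K.
    """
    pooled = [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
              for channel in features]
    return [sum(w * p for w, p in zip(w_k, pooled)) + b_k
            for w_k, b_k in zip(weight, bias)]

def conv1x1_class_maps(features, weight, bias):
    """Assumed replacement head: a 1x1 convolution reusing the linear weights.

    Returns K class evidence maps, each of size H x W.
    """
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [[[sum(w_k[c] * features[c][i][j] for c in range(n_channels)) + b_k
              for j in range(width)]
             for i in range(height)]
            for w_k, b_k in zip(weight, bias)]

def logits_from_maps(class_maps):
    """Spatially average each class map to obtain that class's logit."""
    return [sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in class_maps]

# Toy backbone output: C feature channels on an H x W grid, K classes.
random.seed(0)
C, H, W, K = 4, 3, 3, 2
features = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
weight = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(K)]
bias = [random.uniform(-1, 1) for _ in range(K)]

original_logits = gap_linear_logits(features, weight, bias)
class_maps = conv1x1_class_maps(features, weight, bias)
converted_logits = logits_from_maps(class_maps)

# Both heads agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(original_logits, converted_logits))
print(f"largest logit difference: {max_gap:.2e}")
```

Because averaging and the linear map commute, the two heads give the same logits under these assumptions. The converted head also exposes one spatial map per class, and each map averages exactly to its logit, which is the sense in which such an explanation is built in rather than approximated afterwards.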