Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks
This work addresses the need for interpretable AI in critical applications by providing a method to visualize the features a classifier treats as essential for its decisions. The contribution is incremental, since it builds on existing visualization techniques.
The paper proposes a saliency-driven, data-free method for visualizing discriminative features in deep neural networks. The method extracts single-object class impressions, which yields clearer, higher-confidence visualizations than existing methods, whose images tend to be cluttered.
In this paper, we propose a data-free method for extracting impressions of each class from the classifier's memory. Deep learning enables classifiers to extract distinct patterns (or features) of a given class from training data, and these patterns are the basis on which they generalize to unseen data. Before deploying such models in critical applications, it is advantageous to visualize the features they consider essential for classification. Existing visualization methods generate high-confidence images that contain both background and foreground features, which makes it hard to judge what the crucial features of a given class are. In this work, we propose a saliency-driven approach to visualize the discriminative features that are considered most important for a given task. Another drawback of existing methods is that they raise the confidence of the generated visualizations by creating multiple instances of the given class. We restrict the algorithm to develop a single object per image, which further helps in extracting high-confidence features and results in better visualizations. We further demonstrate the generation of negative images as naturally fused images of two or more classes.
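As a rough illustration of the general idea of saliency-restricted, data-free class visualization, the following minimal sketch runs gradient ascent on the input of a toy linear softmax classifier and updates only the most salient input dimensions. It is not the method of this paper: the classifier, the top-k saliency mask, and the names `class_impression`, `saliency_mask`, `keep`, `steps`, and `lr` are all assumptions introduced here for illustration, and a real setting would use a trained deep network with image-shaped inputs.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(weights, x):
    """Class probabilities of a toy linear classifier: softmax(W x)."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)


def grad_log_prob(weights, x, target):
    """Gradient of log p(target | x) with respect to the input x.

    For a linear softmax model this is W[target] - sum_k p_k * W[k].
    """
    probs = class_probs(weights, x)
    dim = len(x)
    expected = [
        sum(probs[k] * weights[k][j] for k in range(len(weights)))
        for j in range(dim)
    ]
    return [weights[target][j] - expected[j] for j in range(dim)]


def saliency_mask(grad, keep):
    """Binary mask selecting the `keep` inputs with the largest |gradient|."""
    order = sorted(range(len(grad)), key=lambda j: abs(grad[j]), reverse=True)
    kept = set(order[:keep])
    return [1.0 if j in kept else 0.0 for j in range(len(grad))]


def class_impression(weights, target, keep=4, steps=200, lr=0.05):
    """Data-free gradient ascent on the input, restricted to salient inputs.

    Assumption: the mask is computed once from the initial gradient and then
    held fixed, so only one compact set of inputs is ever modified.
    """
    x = [0.0] * len(weights[0])  # start from a blank input, no training data
    mask = saliency_mask(grad_log_prob(weights, x, target), keep)
    for _ in range(steps):
        grad = grad_log_prob(weights, x, target)
        x = [v + lr * m * g for v, m, g in zip(x, mask, grad)]
    return x, mask


# Hypothetical stand-in for a trained classifier: 3 classes, 16 inputs.
random.seed(0)
NUM_CLASSES, DIM, TARGET = 3, 16, 1
weights = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_CLASSES)]

impression, mask = class_impression(weights, TARGET)
confidence = class_probs(weights, impression)[TARGET]
print(f"target-class confidence: {confidence:.3f}")
```

In this toy setup the non-salient inputs stay at zero while the target-class confidence rises above the uniform starting value of 1/3, which loosely mirrors the goal of obtaining a high-confidence visualization without background clutter.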