ProtoMask: Segmentation-Guided Prototype Learning
This work addresses the need for more truthful visual explanations in domains that require fine-grained classification, though the contribution is incremental, since it builds on existing prototype methods.
The paper tackles the problem of unreliable saliency maps in prototype-based explainable AI. It uses segmentation foundation models to restrict saliency computation to predefined semantic patches, and reports competitive performance on fine-grained classification datasets together with unique explainability features.
Explainable AI (XAI) has gained considerable importance in recent years. Methods based on prototypical case-based reasoning have shown promising improvements in explainability. However, these methods typically rely on additional post-hoc saliency techniques to explain the semantics of learned prototypes, and multiple critiques have been raised about the reliability and quality of such techniques. For this reason, we study the use of prominent image segmentation foundation models to improve the truthfulness of the mapping between embedding space and input space. We aim to restrict the computation area of the saliency map to a predefined semantic image patch, which reduces the uncertainty of such visualizations. To capture the information of the entire image, we use the bounding box of each generated segmentation mask to crop the image. Each mask thus yields an individual input to our novel model architecture, named ProtoMask. We conduct experiments on three popular fine-grained classification datasets with a wide set of metrics, providing a detailed overview of explainability characteristics. The comparison with other popular models demonstrates the competitive performance and unique explainability features of our model.
Repository: https://github.com/uos-sis/quanproto
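The cropping step described above can be illustrated with a minimal sketch. It assumes that the segmentation masks are already available as boolean NumPy arrays with the same height and width as the image; the segmentation model itself is not called here. The function names `mask_bounding_box` and `crop_per_mask` are hypothetical and chosen for illustration only. They are not taken from the linked repository, and the actual pipeline may apply further steps (for example resizing or padding) that are not shown.

```python
import numpy as np


def mask_bounding_box(mask):
    """Return (top, bottom, left, right) of the tight box around a boolean mask.

    The bottom and right bounds are exclusive, so they can be used directly
    as slice ends. Hypothetical helper, not part of any library.
    """
    rows = np.any(mask, axis=1)  # True for every row that contains mask pixels
    cols = np.any(mask, axis=0)  # True for every column that contains mask pixels
    if not rows.any():
        raise ValueError("mask is empty, no bounding box exists")
    row_idx = np.flatnonzero(rows)
    col_idx = np.flatnonzero(cols)
    return int(row_idx[0]), int(row_idx[-1]) + 1, int(col_idx[0]), int(col_idx[-1]) + 1


def crop_per_mask(image, masks):
    """Crop the image once per mask, using each mask's bounding box.

    Assumption: `image` has shape (H, W, C) and every mask has shape (H, W).
    Each returned crop would serve as one individual model input.
    """
    crops = []
    for mask in masks:
        top, bottom, left, right = mask_bounding_box(mask)
        crops.append(image[top:bottom, left:right])
    return crops
```

Because every crop is taken from a known bounding box, any saliency map computed on that crop is confined to the corresponding image region by construction, which is the property the abstract relies on.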