Towards Visual Saliency Explanations of Face Verification
This work addresses the need for interpretability in face recognition systems, which is crucial for trust and fairness in applications such as security and surveillance. Its contribution is incremental, however, since it builds on existing saliency map techniques.
The paper tackles the lack of explainability in deep face verification systems by proposing CorrRISE, a model-agnostic method that generates saliency maps to highlight similar and dissimilar regions in face image pairs, showing promising results compared to state-of-the-art approaches.
In recent years, deep convolutional neural networks have been pushing the frontier of face recognition (FR) techniques in both verification and identification scenarios. Despite their high accuracy, they are often criticized for lacking explainability, and there is an increasing demand for understanding the decision-making process of deep face recognition systems. Recent studies have investigated the use of visual saliency maps as explanations, but they often lack discussion and analysis in the context of face recognition. This paper concentrates on explainable face verification tasks and conceives a new explanation framework. Firstly, a definition of the saliency-based explanation method is provided, which focuses on the decisions made by the deep FR model. Secondly, a new model-agnostic explanation method named CorrRISE is proposed to produce saliency maps, which reveal both the similar and dissimilar regions of any given pair of face images. Then, an evaluation methodology is designed to measure the performance of general visual saliency explanation methods in face verification. Finally, extensive visual and quantitative results show that the proposed CorrRISE method achieves promising results compared with other state-of-the-art explainable face verification approaches.
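The abstract does not spell out how CorrRISE computes its maps. As an illustration only, the following Python sketch assumes, going by the name, a RISE-style procedure: random masks perturb one image of the pair, the model's similarity score is recorded for each mask, and the per-pixel correlation between mask values and scores is split into a "similar" map (positive part) and a "dissimilar" map (negative part). The function names, the cosine similarity, the mask grid size, and the square grayscale input are all assumptions made here, not details taken from the paper, and this should not be read as the authors' implementation.

```python
import numpy as np

def random_masks(n_masks, size, cells=4, p_keep=0.5, rng=None):
    """Coarse random binary grids, upsampled (nearest neighbour) to size x size."""
    rng = np.random.default_rng() if rng is None else rng
    grid = (rng.random((n_masks, cells, cells)) < p_keep).astype(np.float64)
    reps = -(-size // cells)  # ceiling division so the upsampled grid covers the image
    up = np.repeat(np.repeat(grid, reps, axis=1), reps, axis=2)
    return up[:, :size, :size]

def cosine(a, b):
    """Cosine similarity between two 1-D vectors (small epsilon avoids 0/0)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def correlation_saliency(embed, img_a, img_b, n_masks=500, cells=4, seed=0):
    """Hypothetical correlation-based saliency for img_a with respect to img_b.

    embed: any callable mapping a square 2-D image to a 1-D feature vector.
           It is treated as a black box, which keeps the sketch model-agnostic.
    Returns (similar_map, dissimilar_map), both non-negative, same shape as img_a.
    """
    rng = np.random.default_rng(seed)
    masks = random_masks(n_masks, img_a.shape[0], cells, rng=rng)
    ref = embed(img_b)
    # Similarity score of each masked version of img_a against the untouched img_b.
    scores = np.array([cosine(embed(img_a * m), ref) for m in masks])
    # Per-pixel Pearson correlation between mask value and similarity score.
    m_c = masks - masks.mean(axis=0)
    s_c = scores - scores.mean()
    cov = np.tensordot(s_c, m_c, axes=(0, 0)) / n_masks
    corr = cov / (masks.std(axis=0) * scores.std() + 1e-12)
    # Positive correlation: keeping the pixel raises similarity; negative: lowers it.
    return np.clip(corr, 0.0, None), np.clip(-corr, 0.0, None)
```

Any embedding function can be passed as `embed`. For a quick sanity check, a toy "model" that simply flattens the image can be used with a pair whose left halves match and whose right halves are opposite in sign; the similar map should then concentrate on the left half and the dissimilar map on the right. Swapping the roles of `img_a` and `img_b` would give the corresponding maps for the second image.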