CVOct 15, 2022

Decoupling Deep Learning for Interpretable Image Recognition

arXiv:2210.08336v36 citationsh-index: 5
Originality Incremental advance
AI Analysis

This work addresses the problem of balancing interpretability and performance in image recognition for researchers and practitioners, though it is incremental as it builds on existing prototype-based methods.

The paper tackled the trade-off between accuracy and interpretability in prototype-based neural networks by proposing DProtoNet, which decouples inference and interpretation modules, achieving a 5% accuracy improvement and state-of-the-art interpretability on multiple datasets.

The interpretability of neural networks has recently received extensive attention. Previous prototype-based explainable networks involved prototype activation in both reasoning and interpretation processes, requiring specific explainable structures for the prototype, thus making the network less accurate as it gains interpretability. Therefore, the decoupling prototypical network (DProtoNet) was proposed to avoid this problem. This new model contains encoder, inference, and interpretation modules. As regards the encoder module, unrestricted feature masks were presented to generate expressive features and prototypes. Regarding the inference module, a multi-image prototype learning method was introduced to update prototypes so that the network can learn generalized prototypes. Finally, concerning the interpretation module, a multiple dynamic masks (MDM) decoder was suggested to explain the neural network, which generates heatmaps using the consistent activation of the original image and mask image at the detection nodes of the network. It decouples the inference and interpretation modules of a prototype-based network by avoiding the use of prototype activation to explain the network's decisions in order to simultaneously improve the accuracy and interpretability of the neural network. The multiple public general and medical datasets were tested, and the results confirmed that our method could achieve a 5% improvement in accuracy and state-of-the-art interpretability compared with previous methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes