NeuroView: Explainable Deep Network Decision Making
This work addresses explainability in deep learning for researchers and practitioners, though it appears incremental because it modifies existing architectures to make them interpretable.
The authors address the problem of identifying which units in a deep neural network contribute to a specific decision. They introduce NeuroView, a family of architectures that are interpretable by design and that establish a direct causal link between unit states and classification decisions, and they validate it on standard datasets.
Deep neural networks (DNs) provide superhuman performance in numerous computer vision tasks, yet it remains unclear exactly which of a DN's units contribute to a particular decision. NeuroView is a new family of DN architectures that are interpretable/explainable by design. Each member of the family is derived from a standard DN architecture by vector quantizing the unit output values and feeding them into a global linear classifier. The resulting architecture establishes a direct, causal link between the state of each unit and the classification decision. We validate NeuroView on standard datasets and classification tasks to show how its unit/class mapping aids in understanding the decision-making process.
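The construction described above (quantize every unit's output, then feed all quantized states into one global linear classifier) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the simplest possible quantizer, a two-level on/off threshold at zero, and the function names (`quantize_units`, `neuroview_logits`, `unit_contributions`), the weights `W`, and the activations are all hypothetical values chosen for illustration.

```python
import numpy as np

def quantize_units(activations, threshold=0.0):
    # Assumed quantizer: two levels, 1.0 if the unit is "on", else 0.0.
    return (activations > threshold).astype(np.float64)

def neuroview_logits(layer_activations, W, b):
    # Concatenate the quantized states of every unit in every layer,
    # then apply a single global linear classifier to that state vector.
    states = np.concatenate([quantize_units(a).ravel() for a in layer_activations])
    return W @ states + b, states

def unit_contributions(states, W, class_idx):
    # Because the classifier is linear in the unit states, each unit's
    # contribution to a class logit is simply weight * state.
    return W[class_idx] * states

# Hypothetical activations from two layers of some backbone network.
layer_activations = [np.array([0.5, -1.0, 2.0]), np.array([-0.3, 0.7])]

# Hypothetical classifier: 2 classes, 5 units in total.
W = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [-1.0, 0.0, 1.0, 0.0, 2.0]])
b = np.zeros(2)

logits, states = neuroview_logits(layer_activations, W, b)
contrib = unit_contributions(states, W, class_idx=0)

print("unit states:      ", states)
print("class logits:     ", logits)
print("class-0 breakdown:", contrib)
```

In this sketch the class logit is exactly the sum of the per-unit contributions plus the bias, which is the sense in which a linear read-out over unit states makes the unit/class mapping directly inspectable.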