LG, AI, CV | Nov 5, 2020

Neuron-based explanations of neural networks sacrifice completeness and interpretability

arXiv:2011.03043v3 | Has Code
AI Analysis

This work examines the limitations of neuron-based explanation methods for neural networks, using AlexNet pretrained on ImageNet as its case study.

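The paper shows that, for AlexNet on ImageNet, explanations based on individual neurons are less complete and less interpretable than explanations based on activation principal components. Two quantitative measures of completeness and a user study of interpretability support this: relatively few high-variance principal components explain much of the activation variance, and the user study finds them more interpretable than neurons.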

High-quality explanations of neural networks (NNs) should exhibit two key properties. Completeness ensures that they accurately reflect a network's function, and interpretability makes them understandable to humans. Many existing methods provide explanations of individual neurons within a network. In this work we provide evidence that for AlexNet pretrained on ImageNet, neuron-based explanation methods sacrifice both completeness and interpretability compared to activation principal components. Neurons are a poor basis for AlexNet embeddings because they don't account for the distributed nature of these representations. By examining two quantitative measures of completeness and conducting a user study to measure interpretability, we show the most important principal components provide more complete and interpretable explanations than the most important neurons. Much of the activation variance may be explained by examining relatively few high-variance PCs, as opposed to studying every neuron. These principal components also strongly affect network function, and are significantly more interpretable than neurons. Our findings suggest that explanation methods for networks like AlexNet should avoid using neurons as a basis for embeddings and instead choose a basis, such as principal components, which accounts for the high-dimensional and distributed nature of a network's internal representations. An interactive demo and code are available at https://ndey96.github.io/neuron-explanations-sacrifice.
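
The claim that a few high-variance PCs explain much of the activation variance can be illustrated with a minimal sketch. This is not the authors' code or their exact completeness measure. It assumes a synthetic activation matrix (a few latent factors mixed across many units, plus noise) as a stand-in for real AlexNet activations, and the function names are hypothetical. It compares the cumulative fraction of variance captured by the top-k highest-variance neurons with that captured by the top-k principal components.

```python
import numpy as np


def make_synthetic_activations(n_samples=2000, n_neurons=64, n_factors=4, seed=0):
    """Hypothetical stand-in for a layer's activations (NOT real AlexNet data).

    A few latent factors are mixed across all neurons, so the
    representation is distributed rather than axis-aligned.
    """
    rng = np.random.default_rng(seed)
    latents = rng.normal(size=(n_samples, n_factors))
    mixing = rng.normal(size=(n_factors, n_neurons))
    noise = 0.1 * rng.normal(size=(n_samples, n_neurons))
    return latents @ mixing + noise


def neuron_variance_curve(acts):
    """Cumulative fraction of total variance held by the k highest-variance neurons."""
    var = acts.var(axis=0)
    var_sorted = np.sort(var)[::-1]
    return np.cumsum(var_sorted) / var.sum()


def pc_variance_curve(acts):
    """Cumulative fraction of total variance held by the top-k principal components."""
    centered = acts - acts.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    pc_var = singular_values ** 2
    return np.cumsum(pc_var) / pc_var.sum()


acts = make_synthetic_activations()
neuron_curve = neuron_variance_curve(acts)
pc_curve = pc_variance_curve(acts)

for k in (1, 4, 16):
    print(f"top {k:2d}: neurons {neuron_curve[k - 1]:.3f}, PCs {pc_curve[k - 1]:.3f}")
```

On any data the PC curve lies on or above the neuron curve, because the top-k eigenvalues of a covariance matrix sum to at least as much as its k largest diagonal entries. How large the gap is on real activations depends on how distributed the representation is, which is the empirical question the paper addresses for AlexNet.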

