CV, LG, ML | Aug 16, 2019

Gradient Weighted Superpixels for Interpretability in CNNs

arXiv:1908.08997v1 | 1 citation
AI Analysis

This work addresses the need for interpretable CNNs in applications such as video analysis, offering a faster alternative to LIME. The contribution is incremental: it combines existing gradient-based and superpixel techniques.

The paper tackles the trade-off between computational efficiency and interpretability in CNNs, particularly for large inputs like video, by introducing gradient-weighted superpixels to approximate LIME's explanations in a fraction of the time, as demonstrated on ImageNet and Kinetics-400 datasets.

As Convolutional Neural Networks embed themselves into our everyday lives, the need for them to be interpretable increases. However, there is often a trade-off between methods that are efficient to compute but produce an explanation that is difficult to interpret, and those that are slow to compute but provide a more interpretable result. This is particularly challenging in problem spaces that require a large input volume, especially video, which combines both spatial and temporal dimensions. In this work we introduce the idea of scoring superpixels using gradient-based pixel scoring techniques. We show qualitatively and quantitatively that this approach approximates LIME in a fraction of the time. We investigate our techniques using both image classification and action recognition networks on large-scale datasets (ImageNet and Kinetics-400 respectively).
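
The core idea in the abstract, scoring superpixels from per-pixel gradient scores, can be illustrated with a minimal sketch. It assumes a per-pixel saliency map and a superpixel label map have already been computed by some other means. The function names (score_superpixels, top_superpixel_mask) are hypothetical, and averaging the saliency inside each superpixel is an assumed aggregation rule. The abstract does not say how the paper combines pixel scores, so this should not be read as the authors' implementation.

```python
import numpy as np


def score_superpixels(saliency, segments):
    """Return the mean per-pixel saliency of each superpixel.

    saliency: float array of per-pixel scores (e.g. H x W, or T x H x W for video).
    segments: int array of the same shape holding non-negative superpixel labels.
    Entry k of the result is the mean saliency of superpixel k
    (0.0 for labels that never occur).
    Note: mean aggregation is an assumption made for this sketch.
    """
    saliency = np.asarray(saliency, dtype=float)
    segments = np.asarray(segments)
    if saliency.shape != segments.shape:
        raise ValueError("saliency and segments must have the same shape")
    labels = segments.ravel()
    sums = np.bincount(labels, weights=saliency.ravel())
    counts = np.bincount(labels)
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)


def top_superpixel_mask(saliency, segments, k):
    """Boolean mask covering the k highest-scoring superpixels."""
    scores = score_superpixels(saliency, segments)
    top = np.argsort(scores)[::-1][:k]
    return np.isin(segments, top)


# Toy example: a 4 x 4 saliency map split into four 2 x 2 superpixels.
toy_saliency = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.0, 0.1],
    [0.2, 0.1, 0.3, 0.4],
    [0.1, 0.2, 0.5, 0.4],
])
toy_segments = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
])
toy_scores = score_superpixels(toy_saliency, toy_segments)
toy_mask = top_superpixel_mask(toy_saliency, toy_segments, k=2)
```

Because the aggregation is a single pass over the pixels, its cost is small next to the gradient computation that produces the saliency map. The same functions accept a T x H x W volume with 3D superpixel labels, which is the video case the abstract highlights.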