LG | ML | May 11, 2021

Leveraging Sparse Linear Layers for Debuggable Deep Networks

arXiv:2105.04857v1 | 102 citations | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the need for more interpretable AI models in vision and language tasks. The advance is incremental, since it builds on existing sparse linear methods.

The paper makes deep networks more debuggable by fitting sparse linear models over their learned features. The resulting networks maintain high accuracy while being easier to interpret, as demonstrated through numerical and human experiments.

We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively via numerical and human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks. The code for our toolkit can be found at https://github.com/madrylab/debuggabledeepnetworks.
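
A minimal sketch of the idea in the abstract, assuming the deep features have already been extracted and frozen. Synthetic Gaussian vectors stand in for those features, and an L1-penalized logistic regression fitted by proximal gradient descent stands in for the sparse linear model. The function name `fit_sparse_linear`, the penalty `lam`, and the data are illustrative assumptions; this is not the toolkit's API and may differ from the authors' solver.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero and
    # sets values with magnitude <= t to exactly zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_sparse_linear(features, labels, lam=0.1, lr=0.5, steps=300):
    """Hypothetical helper: L1-penalized logistic regression via proximal gradient."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * x[j] / n
        # Gradient step on the loss, then shrinkage for the L1 penalty.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the bias is left unpenalized
    return w, b


# Stand-in for frozen deep features: only features 0 and 3 carry signal.
random.seed(0)
n, d = 200, 10
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
labels = [1 if x[0] - x[3] > 0 else 0 for x in features]

w, b = fit_sparse_linear(features, labels)
active = [j for j, wj in enumerate(w) if wj != 0.0]
accuracy = sum(
    (sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) > 0.5) == (y == 1)
    for x, y in zip(features, labels)
) / n
print("active features:", active)
print("training accuracy:", round(accuracy, 3))
```

The nonzero entries of `w` are the features the decision layer actually uses, so inspecting a prediction reduces to reading a short list of weights. Raising `lam` trades accuracy for fewer active features.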

Code Implementations: 2 repos
