LG | ML | Jun 21, 2018

On the Robustness of Interpretability Methods

arXiv:1806.08049v1 | 642 citations
AI Analysis

This paper addresses how reliable interpretability methods are for people who use them in AI and machine learning. Its contribution is incremental, since it builds on existing approaches.

The paper tackles the problem of ensuring that interpretability methods produce similar explanations for similar inputs. It demonstrates that current methods perform poorly on this robustness criterion and proposes ways to enforce robustness.

We argue that robustness of explanations---i.e., that similar inputs should give rise to similar explanations---is a key desideratum for interpretability. We introduce metrics to quantify robustness and demonstrate that current methods do not perform well according to these metrics. Finally, we propose ways that robustness can be enforced on existing interpretability approaches.
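To make "similar inputs should give rise to similar explanations" concrete, here is a minimal sketch of one way such a robustness score could be estimated. It assumes a local Lipschitz-style ratio: the largest change in the explanation divided by the change in the input, over random perturbations in a small box around a point. The abstract above does not spell out the metric, so the function names, the toy explainer, the sampling scheme, and the parameters `eps` and `n_samples` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gradient_explanation(x):
    """Hypothetical toy explainer: gradient of f(x) = sum(sin(3 * x)) w.r.t. x."""
    return 3.0 * np.cos(3.0 * x)


def local_lipschitz_estimate(explain, x, eps=0.1, n_samples=200, seed=0):
    """Estimate how much an explanation can change near x.

    Samples perturbations uniformly from the box [-eps, eps]^d and returns the
    largest observed ratio ||explain(x + delta) - explain(x)|| / ||delta||.
    Random sampling only gives a lower bound on the true local worst case.
    """
    rng = np.random.default_rng(seed)
    base = explain(x)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=x.shape)
        dist = np.linalg.norm(delta)
        if dist == 0.0:
            continue  # skip the degenerate zero perturbation
        change = np.linalg.norm(explain(x + delta) - base)
        worst = max(worst, change / dist)
    return worst


if __name__ == "__main__":
    x0 = np.array([0.5, -0.2, 1.0])
    score = local_lipschitz_estimate(gradient_explanation, x0)
    print(f"estimated local Lipschitz ratio: {score:.3f}")
```

Lower values mean the explanation is more stable around that point. A sampled maximum can miss the worst-case perturbation, so an optimization-based search over the neighbourhood would give a tighter estimate at higher cost.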

Code Implementations (3 repos)
