LG · AI · IT · Oct 12, 2021

A Rate-Distortion Framework for Explaining Black-box Model Decisions

arXiv:2110.08252v1 · 16 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretability among users of complex AI models, though it appears incremental because it builds on existing perturbation-based explanation methods.

The paper tackles the problem of explaining black-box model decisions by introducing the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method based on input perturbations applicable to any differentiable pre-trained model, with experiments showing adaptability across images, audio, and urban simulations.

We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model such as neural networks. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.
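To make the perturbation idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy differentiable model f(x) = sigmoid(w · x), Gaussian noise as the perturbation, squared error as the distortion, and an L1 penalty to keep the mask sparse. The function name `rde_mask` and all parameters (`lam`, `steps`, `lr`, `n_samples`) are hypothetical choices made for illustration. The sketch learns a mask s in [0, 1]^d such that keeping the masked components of x, and replacing the rest with noise, changes the model output as little as possible.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rde_mask(w, x, lam=0.05, steps=500, lr=0.2, n_samples=64, seed=0):
    """Toy mask optimization for the assumed model f(x) = sigmoid(w @ x).

    Minimizes  E_n[(f(s*x + (1-s)*n) - f(x))^2] + lam * ||s||_1
    over s in [0, 1]^d by projected stochastic gradient descent.
    All modelling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    s = np.full_like(x, 0.5, dtype=float)   # start with a half-open mask
    target = sigmoid(w @ x)                 # model output on the clean input
    for _ in range(steps):
        noise = rng.normal(size=(n_samples, x.size))
        y = s * x + (1.0 - s) * noise       # perturbed inputs, shape (n_samples, d)
        out = sigmoid(y @ w)                # model outputs, shape (n_samples,)
        # Chain rule: d/ds_i (out - target)^2
        #   = 2 (out - target) * out (1 - out) * w_i * (x_i - n_i)
        dout = 2.0 * (out - target) * out * (1.0 - out)
        grad_dist = (dout[:, None] * w * (x - noise)).mean(axis=0)
        grad = grad_dist + lam              # gradient of lam * ||s||_1 for s >= 0
        s = np.clip(s - lr * grad, 0.0, 1.0)  # project back onto [0, 1]^d
    return s

# Usage: only the first feature influences the toy model's output.
w = np.array([3.0, 0.0, 0.0, 0.0])
x = np.array([2.0, 1.0, 1.0, 1.0])
mask = rde_mask(w, x)
print(mask)
```

In this toy setting the mask entries for the three features with zero weight are driven to zero by the sparsity penalty, while the entry for the one feature the model actually uses stays well above zero. For a real pre-trained network, the analytic gradient would be replaced by automatic differentiation.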

