LG AI IT | Oct 12, 2021

A Rate-Distortion Framework for Explaining Black-box Model Decisions

Stefan Kolek, Duc Anh Nguyen, Ron Levie, Joan Bruna, Gitta Kutyniok

arXiv:2110.08252v1 | 11.316 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretability among users of complex AI models, though it appears incremental because it builds on existing perturbation-based explanation methods.

The paper tackles the problem of explaining black-box model decisions by introducing the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method based on input perturbations applicable to any differentiable pre-trained model, with experiments showing adaptability across images, audio, and urban simulations.

We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model such as neural networks. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.
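To make the idea of a perturbation-based explanation for a differentiable model concrete, here is a minimal sketch in Python. It is not the authors' implementation, and this page does not state their objective. The sketch assumes a common formulation: learn a mask s in [0, 1] over the input features, replace unmasked features with Gaussian noise, and minimize the expected squared change in the model output plus a sparsity penalty on s. The toy logistic model, the function name `rde_mask`, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rde_mask(w, x, lam=0.05, lr=0.1, steps=500, n_samples=256, seed=0):
    """Learn a relevance mask for a toy model f(x) = sigmoid(w . x).

    Assumed objective (illustrative, not taken from the paper):
        E_n[(f(s*x + (1-s)*n) - f(x))^2] + lam * sum(s),  with s in [0, 1]
    where n is standard Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    target = sigmoid(w @ x)              # model output on the clean input
    s = np.ones_like(x)                  # start by keeping every feature
    for _ in range(steps):
        n = rng.standard_normal((n_samples, x.size))
        y = s * x + (1.0 - s) * n        # perturbed (obfuscated) inputs
        p = sigmoid(y @ w)               # model output on perturbed inputs
        # Chain rule: d/ds_i (p - target)^2 = 2 (p - target) p (1 - p) w_i (x_i - n_i)
        coef = 2.0 * (p - target) * p * (1.0 - p)
        grad_distortion = np.mean(coef[:, None] * w * (x - n), axis=0)
        # Gradient step on distortion + sparsity, then project back to [0, 1]
        s = np.clip(s - lr * (grad_distortion + lam), 0.0, 1.0)
    return s

# Hypothetical example: only the first two features influence the model.
w = np.array([2.0, -2.0, 0.0, 0.0])
x = np.array([1.0, -1.0, 0.5, -0.5])
mask = rde_mask(w, x)
print(mask)
```

In this toy setting the features with zero weight have no effect on the output, so the sparsity penalty drives their mask entries to zero, while the two influential features keep a clearly positive mask value. For a real neural network, the hand-written gradient would be replaced by automatic differentiation.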

