LG · AI · May 17, 2021

How to Explain Neural Networks: an Approximation Perspective

arXiv:2105.07831v2 · 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses the interpretability bottleneck that hinders AI adoption, though it appears incremental because it builds on existing approximation concepts.

The paper tackles neural network interpretability by developing a framework based on approximation theory. It first applies the framework to fully connected networks, then proposes MLPs as universal interpreters for black-box models, and reports extensive experiments in support of the approach.

The lack of interpretability has hindered the large-scale adoption of AI technologies. However, the fundamental idea of interpretability, as well as how to put it into practice, remains unclear. We provide notions of interpretability based on approximation theory in this study. We first implement this approximation interpretation on a specific model (fully connected neural network) and then propose to use MLP as a universal interpreter to explain arbitrary black-box models. Extensive experiments demonstrate the effectiveness of our approach.
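The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of fitting an MLP to approximate a black-box model from its outputs. It is not the authors' method. The function `black_box`, the one-hidden-layer tanh architecture, plain SGD, and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def black_box(x):
    # Hypothetical stand-in for an opaque model: we only query its outputs.
    return math.sin(3.0 * x) + 0.5 * x


def init_mlp(hidden, rng):
    # One hidden layer with tanh units, scalar input and scalar output.
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "b1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "w2": [rng.uniform(-1.0, 1.0) / hidden for _ in range(hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    # Returns the prediction and the hidden activations (needed for backprop).
    h = [math.tanh(w * x + b) for w, b in zip(p["w1"], p["b1"])]
    y = sum(w * hj for w, hj in zip(p["w2"], h)) + p["b2"]
    return y, h


def sgd_step(p, x, target, lr):
    # One stochastic gradient step on the squared error 0.5 * (y - target)^2.
    y, h = forward(p, x)
    err = y - target
    for j, hj in enumerate(h):
        # Gradient w.r.t. the hidden pre-activation, using w2 before its update.
        grad_pre = err * p["w2"][j] * (1.0 - hj * hj)
        p["w2"][j] -= lr * err * hj
        p["w1"][j] -= lr * grad_pre * x
        p["b1"][j] -= lr * grad_pre
    p["b2"] -= lr * err


def mse(p, xs):
    # Mean squared gap between the surrogate MLP and the black box on xs.
    return sum((forward(p, x)[0] - black_box(x)) ** 2 for x in xs) / len(xs)


def fit_surrogate(hidden=16, epochs=100, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Query the black box on random inputs; its answers are the training labels.
    train_x = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    p = init_mlp(hidden, rng)
    test_x = [-1.0 + 2.0 * i / 99 for i in range(100)]
    before = mse(p, test_x)
    for _ in range(epochs):
        rng.shuffle(train_x)
        for x in train_x:
            sgd_step(p, x, black_box(x), lr)
    return p, before, mse(p, test_x)


if __name__ == "__main__":
    _, before, after = fit_surrogate()
    print(f"approximation error before: {before:.4f}, after: {after:.4f}")
```

The error on held-out inputs measures how faithfully the MLP approximates the black box, which is the quantity an approximation-based notion of interpretability would need to control. How the paper turns such an approximation into an explanation is not stated in the abstract.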

