LG | ML | Jan 9, 2020

Sampling Prediction-Matching Examples in Neural Networks: A Probabilistic Programming Approach

arXiv:2001.03076v1 | 2 citations
AI Analysis

This work addresses interpretability for neural network users by providing a way to sample examples that match specific predictions, though it is incremental as it builds on existing probabilistic programming techniques.

The paper tackles the problem of understanding how neural network classifiers make individual predictions by exploring prediction level sets using probabilistic programming, and demonstrates the method on synthetic and MNIST datasets to generate examples with specified predictions.

Though neural network models demonstrate impressive performance, we do not understand exactly how these black-box models make individual predictions. This drawback has led to substantial research devoted to understanding these models in areas such as robustness, interpretability, and generalization ability. In this paper, we consider the problem of exploring the prediction level sets of a classifier using probabilistic programming. We define a prediction level set to be the set of examples for which the predictor has the same specified prediction confidence with respect to some arbitrary data distribution. Notably, our sampling-based method does not require the classifier to be differentiable, making it compatible with arbitrary classifiers. As a specific instantiation, if we take the classifier to be a neural network and the data distribution to be that of the training data, we can obtain examples that will result in specified predictions by the neural network. We demonstrate this technique with experiments on a synthetic dataset and MNIST. Such level sets in classification may facilitate human understanding of classification behaviors.
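To make the idea of sampling from a prediction level set concrete, here is a minimal sketch. It is not the paper's implementation: instead of a probabilistic programming system, it uses a hand-written random-walk Metropolis sampler. The classifier (`classifier_confidence`), the standard Gaussian stand-in for the data distribution (`log_prior`), and the tolerance `tol` that softens the "same confidence" constraint are all hypothetical choices made for illustration. The sampler only ever evaluates the classifier, so no gradients are needed.

```python
import numpy as np

def classifier_confidence(x):
    """Hypothetical black-box classifier returning P(class 1 | x).
    Only function evaluations are used, never gradients."""
    return 1.0 / (1.0 + np.exp(-(2.0 * x[0] - x[1])))

def log_prior(x):
    """Stand-in data distribution: standard 2-D Gaussian (an assumption)."""
    return -0.5 * np.dot(x, x)

def log_target(x, target_conf, tol):
    """Unnormalized log density: data prior times a soft constraint that
    keeps the classifier confidence within roughly `tol` of the target."""
    gap = classifier_confidence(x) - target_conf
    return log_prior(x) - 0.5 * (gap / tol) ** 2

def sample_level_set(target_conf, n_steps=20000, step=0.3, tol=0.02, seed=0):
    """Random-walk Metropolis over inputs; the first half is discarded as burn-in."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    logp = log_target(x, target_conf, tol)
    samples = []
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal(2)
        logp_new = log_target(proposal, target_conf, tol)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
        if i >= n_steps // 2:
            samples.append(x.copy())
    return np.array(samples)

# Draw inputs that the classifier labels with about 0.8 confidence.
samples = sample_level_set(target_conf=0.8)
confs = np.array([classifier_confidence(s) for s in samples])
```

The retained samples concentrate near the set of inputs where the confidence is about 0.8, weighted by how plausible each input is under the stand-in data distribution. Tightening `tol` approximates the exact level set more closely at the cost of a lower acceptance rate.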
