ML · LG · Mar 22, 2018

Understanding Measures of Uncertainty for Adversarial Example Detection

arXiv:1803.08533v1 · 412 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of adversarial example detection for machine learning security, but it is incremental as it builds on existing uncertainty measures without introducing a new paradigm.

The paper studies how different uncertainty measures perform at detecting adversarial examples. It finds that mutual information is effective, highlights failure modes of MC dropout, and proposes improving uncertainty estimates with probabilistic model ensembles. Illustrative experiments use MNIST and a Kaggle dogs vs cats classification dataset.

Measuring uncertainty is a promising technique for detecting adversarial examples, crafted inputs on which the model predicts an incorrect class with high confidence. But many measures of uncertainty exist, including predictive entropy and mutual information, each capturing different types of uncertainty. We study these measures, and shed light on why mutual information seems to be effective at the task of adversarial example detection. We highlight failure modes for MC dropout, a widely used approach for estimating uncertainty in deep models. This leads to an improved understanding of the drawbacks of current methods, and a proposal to improve the quality of uncertainty estimates using probabilistic model ensembles. We give illustrative experiments using MNIST to demonstrate the intuition underlying the different measures of uncertainty, as well as experiments on a real world Kaggle dogs vs cats classification dataset.
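To make the two measures named in the abstract concrete, here is a minimal NumPy sketch of how predictive entropy and mutual information are commonly computed from a stack of stochastic forward passes (for example MC dropout samples or ensemble members). It uses the standard textbook definitions and is not taken from the paper's code; the function names, the `(T, N, C)` array layout, and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Entropy of the averaged prediction.

    probs: array of shape (T, N, C) holding softmax outputs from
    T stochastic passes over N inputs with C classes (assumed layout).
    """
    mean_probs = probs.mean(axis=0)  # (N, C)
    return -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)

def expected_entropy(probs, eps=1e-12):
    """Average of the per-pass entropies."""
    per_pass = -np.sum(probs * np.log(probs + eps), axis=-1)  # (T, N)
    return per_pass.mean(axis=0)

def mutual_information(probs, eps=1e-12):
    """Predictive entropy minus expected entropy.

    Large only when the individual passes disagree with each other.
    """
    return predictive_entropy(probs, eps) - expected_entropy(probs, eps)

# Toy inputs (hypothetical): T=2 passes, N=1 input, C=2 classes.
# Both passes agree that the input is ambiguous.
agree = np.array([[[0.5, 0.5]], [[0.5, 0.5]]])
# Each pass is confident, but the passes contradict each other.
disagree = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])

for name, p in [("agree", agree), ("disagree", disagree)]:
    print(name, predictive_entropy(p)[0], mutual_information(p)[0])
```

Both toy inputs have the same predictive entropy, because the averaged prediction is uniform in each case. Only the second has non-zero mutual information, since there the passes disagree with one another. This illustrates the sense in which the two measures capture different types of uncertainty.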

Code Implementations: 1 repo
