LG CV ML
May 10, 2019

Interpreting and Evaluating Neural Network Robustness

arXiv:1905.04270v1
59 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better robustness evaluation in deep learning, offering a more reliable and cost-effective metric for researchers and practitioners, though it is incremental in nature.

The paper tackles the problem of evaluating neural network robustness against adversarial attacks by proposing a new metric based on prediction divergence, which outperforms conventional accuracy-based methods in uniformity, reliability, and efficiency.

Adversarial deception has recently become one of the most considerable threats to deep neural networks. However, compared with the extensive research on new designs of adversarial attacks and defenses, the intrinsic robustness of neural networks still lacks thorough investigation. This work aims to qualitatively interpret adversarial attack and defense mechanisms through loss visualization, and to establish a quantitative metric for evaluating a neural network model's intrinsic robustness. The proposed robustness metric identifies the upper bound of a model's prediction divergence in a given domain and thus indicates whether the model can maintain a stable prediction. In extensive experiments, our metric demonstrates several advantages over conventional robustness estimation based on adversarial testing accuracy: (1) it provides a uniform evaluation of models with different structures and parameter scales; (2) it outperforms conventional accuracy-based robustness estimation and provides a more reliable evaluation that is invariant to different test settings; (3) it can be generated quickly without considerable testing cost.
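To make the idea of "prediction divergence in a given domain" concrete, here is a minimal sketch, not the paper's actual formulation. It assumes the domain is an L-infinity ball of radius eps around an input, measures divergence as the L2 distance between softmax outputs, and searches by random sampling, which can only approach the true upper bound from below. The toy linear model, the function name max_prediction_divergence, and all parameter values are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_prediction_divergence(predict, x, eps, n_samples=1000, seed=0):
    """Monte Carlo estimate of the largest L2 gap between predict(x) and
    predict(x + delta) over perturbations with ||delta||_inf <= eps.

    Assumption: random sampling stands in for a proper maximisation, so the
    result is a lower estimate of the true upper bound.
    """
    rng = np.random.default_rng(seed)
    base = predict(x[None, :])[0]                      # prediction at the clean input
    deltas = rng.uniform(-eps, eps, size=(n_samples, x.size))
    probs = predict(x[None, :] + deltas)               # predictions at perturbed inputs
    gaps = np.linalg.norm(probs - base, axis=1)        # divergence per sample
    return float(gaps.max())

# Hypothetical toy model: a fixed linear layer followed by softmax.
W = np.array([[2.0, -1.0, 0.5],
              [-1.0, 2.0, 0.5]])

def predict(X):
    return softmax(X @ W)

x = np.array([0.3, -0.2])
print(max_prediction_divergence(predict, x, eps=0.1))
```

A smaller value suggests the toy model's prediction stays more stable inside the chosen neighbourhood. Unlike adversarial test accuracy, this quantity does not depend on labels or on a particular attack, which is the general property the abstract emphasises.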

