CV ML Jun 9, 2020

Towards an Intrinsic Definition of Robustness for a Classifier

Théo Giraudon, Vincent Gripon, Matthias Löwe, Franck Vermet

arXiv:2006.05095v21.2

Originality Incremental advance

AI Analysis

This paper addresses the need for reliable robustness metrics in machine learning, particularly for deep learning classifiers vulnerable to adversarial attacks. The advance is incremental, as it refines existing measurement approaches.

The paper tackles the problem of measuring classifier robustness. It shows that averaging robustness radii over a validation set is a statistically weak measure and proposes instead a score that weights samples by their difficulty. For logistic regression, the score is shown theoretically to be independent of the choice of samples; for deep convolutional networks on real datasets, it is shown empirically to depend only slightly on that choice.

The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs. Therefore, finding good measures of robustness of a trained classifier is a key issue in the field. In this paper, we point out that averaging the radius of robustness of samples in a validation set is a statistically weak measure. We propose instead to weight the importance of samples depending on their difficulty. We motivate the proposed score by a theoretical case study using logistic regression, where we show that the proposed score is independent of the choice of the samples it is evaluated upon. We also empirically demonstrate the ability of the proposed score to measure robustness of classifiers with little dependence on the choice of samples in more complex settings, including deep convolutional neural networks and real datasets.
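To make the contrast in the abstract concrete, here is a minimal sketch of why a plain average of robustness radii depends on the evaluation samples. It assumes a linear (logistic regression) classifier, for which the L2 robustness radius of a sample is its distance to the decision hyperplane, |w·x + b| / ||w||. The function names and the `illustrative_difficulty` weighting are hypothetical and chosen only for illustration. The abstract does not give the paper's weighting, so this is not the authors' score.

```python
import numpy as np


def robustness_radii(X, w, b):
    """L2 distance from each row of X to the hyperplane w.x + b = 0."""
    return np.abs(X @ w + b) / np.linalg.norm(w)


def mean_radius(X, w, b):
    """Plain average of the radii: the baseline the abstract calls weak."""
    return float(robustness_radii(X, w, b).mean())


def weighted_radius(X, w, b, weights):
    """Weighted average of the radii for any nonnegative sample weights."""
    r = robustness_radii(X, w, b)
    return float(np.sum(weights * r) / np.sum(weights))


def illustrative_difficulty(X, w, b):
    """Hypothetical difficulty weight (an assumption, not the paper's):
    close to 1 near the decision boundary, close to 0 for confident samples."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return 1.0 - np.abs(2.0 * p - 1.0)


# Same classifier, two different evaluation sets.
w = np.array([3.0, 4.0])
b = 0.0
X_near = np.array([[0.1, 0.0], [0.0, -0.1]])   # samples close to the boundary
X_far = np.array([[3.0, 4.0], [-3.0, -4.0]])   # samples far from the boundary

print(mean_radius(X_near, w, b))  # about 0.07
print(mean_radius(X_far, w, b))   # about 5.0
print(weighted_radius(X_near, w, b, illustrative_difficulty(X_near, w, b)))
```

The classifier is identical in both calls to `mean_radius`, yet the average radius differs by almost two orders of magnitude. This is the sample dependence that a difficulty-weighted score is meant to reduce.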
