Towards Consistency in Adversarial Classification
This addresses a foundational issue in adversarial machine learning for researchers, but it is incremental as it provides theoretical insights without new empirical results.
The paper tackles the problem of whether surrogate losses can be used as proxies for minimizing the 0/1 loss in adversarial classification, showing that no convex surrogate loss is consistent or calibrated in this context, and identifies conditions for calibration in both adversarial and standard settings.
In this paper, we study the problem of consistency in the context of adversarial examples. Specifically, we tackle the following question: can surrogate losses still be used as a proxy for minimizing the $0/1$ loss in the presence of an adversary that alters the inputs at test-time? Different from the standard classification task, this question cannot be reduced to a point-wise minimization problem, and calibration needs not to be sufficient to ensure consistency. In this paper, we expose some pathological behaviors specific to the adversarial problem, and show that no convex surrogate loss can be consistent or calibrated in this context. It is therefore necessary to design another class of surrogate functions that can be used to solve the adversarial consistency issue. As a first step towards designing such a class, we identify sufficient and necessary conditions for a surrogate loss to be calibrated in both the adversarial and standard settings. Finally, we give some directions for building a class of losses that could be consistent in the adversarial framework.