Existence and Minimax Theorems for Adversarial Surrogate Risks in Binary Classification
This work addresses the lack of theoretical foundations for adversarial training in binary classification. The contribution is incremental in that it extends known existence and minimax theorems for the adversarial classification risk to surrogate risks.
The paper tackles the theoretical understanding of adversarial training by proving existence, regularity, and minimax theorems for adversarial surrogate risks, explaining empirical observations and suggesting new algorithm directions.
Adversarial training is one of the most popular methods for training models robust to adversarial attacks; however, it is not well understood from a theoretical perspective. We prove existence, regularity, and minimax theorems for adversarial surrogate risks. Our results explain some empirical observations on adversarial robustness from prior work and suggest new directions in algorithm development. Furthermore, our results extend previously known existence and minimax theorems for the adversarial classification risk to surrogate risks.
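
For orientation, the objects named in the abstract can be written out as follows. This is a sketch in assumed notation, not the paper's exact statement: the symbols P_0 and P_1 (class-conditional measures for the two labels), the surrogate loss \phi, the score function f, the perturbation radius \epsilon, and the uncertainty sets \mathcal{B}_\epsilon(P_i) are all introduced here for illustration, and the conditions under which the equality holds are omitted.

```latex
% Adversarial surrogate risk (assumed notation): each point x may be moved
% by an adversary to any x' within distance epsilon before the loss is taken.
R_\phi^\epsilon(f)
  = \int \sup_{\|x' - x\| \le \epsilon} \phi\bigl(f(x')\bigr)\, dP_1(x)
  + \int \sup_{\|x' - x\| \le \epsilon} \phi\bigl(-f(x')\bigr)\, dP_0(x)

% Schematic shape of a minimax statement: the learner minimizes over f, the
% adversary maximizes over perturbed distributions P_0', P_1' close to P_0, P_1.
\inf_{f} \; \sup_{P_0' \in \mathcal{B}_\epsilon(P_0),\; P_1' \in \mathcal{B}_\epsilon(P_1)}
  R_\phi(f;\, P_0', P_1')
  \;=\;
\sup_{P_0' \in \mathcal{B}_\epsilon(P_0),\; P_1' \in \mathcal{B}_\epsilon(P_1)} \; \inf_{f}
  R_\phi(f;\, P_0', P_1')
```

Read this way, an existence theorem asserts that the infimum over f (and the supremum over perturbed distributions) is attained, and a minimax theorem asserts that the order of the two optimizations can be exchanged.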