LG · GT · Oct 23, 2022

Nash Equilibria and Pitfalls of Adversarial Training in Adversarial Robustness Games

arXiv:2210.12606v3 · 13 citations · h-index: 60
Originality: Incremental advance
AI Analysis

The paper addresses convergence issues in adversarial training, which is relevant to machine learning practitioners, but the advance is incremental because it builds on existing game-theoretic frameworks.

The paper studies whether adversarial training converges when viewed as an adversarial robustness game. It proves that the alternating best-response strategy may fail to converge even for a linear classifier, while a unique pure Nash equilibrium exists and is provably robust. Experiments support both results.

Adversarial training is a standard technique for training adversarially robust models. In this paper, we study adversarial training as an alternating best-response strategy in a 2-player zero-sum game. We prove that even in a simple scenario of a linear classifier and a statistical model that abstracts robust vs. non-robust features, the alternating best-response strategy of such a game may not converge. On the other hand, a unique pure Nash equilibrium of the game exists and is provably robust. We support our theoretical results with experiments, showing the non-convergence of adversarial training and the robustness of the Nash equilibrium.
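
The alternating best-response dynamic described in the abstract can be illustrated with a small sketch. This is a simplified toy, not the paper's exact model or proof. It assumes class-conditional Gaussians with identity covariance and means ±mu, an l-infinity adversary with budget eps, and a learner whose best response is taken to be the direction of the shifted class mean. The function names and the values of mu and eps are hypothetical.

```python
import numpy as np

def adversary_best_response(w, eps):
    # Assumption: against a linear classifier sign(w @ x), the worst-case
    # l_inf shift for class y = +1 is -eps * sign(w) (mirrored for y = -1).
    return -eps * np.sign(w)

def learner_best_response(mu, shift):
    # Assumption: with isotropic Gaussian classes, an optimal linear
    # direction is the class mean after the adversary's shift.
    return mu + shift

def alternate(mu, eps, rounds):
    # Run alternating best responses and record the learner's weights.
    w = mu.copy()
    history = [w.copy()]
    for _ in range(rounds):
        shift = adversary_best_response(w, eps)
        w = learner_best_response(mu, shift)
        history.append(w.copy())
    return np.array(history)

# Hypothetical values: feature 0 is robust (mean > eps),
# feature 1 is non-robust (mean < eps).
mu = np.array([2.0, 0.3])
eps = 0.5
hist = alternate(mu, eps, rounds=6)
print(hist)
```

In this toy, the weight on the robust feature stays positive, while the weight on the non-robust feature changes sign every round, so the iterates cycle between two classifiers instead of settling on one.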

