LG | May 8, 2023

TAPS: Connecting Certified and Adversarial Training

arXiv:2305.04574v2 | 14 citations | Has Code
AI Analysis

This paper addresses the trade-off between certified robustness and standard accuracy when training neural networks for machine learning security. It is an incremental improvement over existing methods.

The paper tackles the problem of training certifiably robust neural networks by proposing TAPS, a method that combines IBP and PGD training to reduce over-regularization. It reports a new state of the art in many settings, for example a certified accuracy of 22% on TinyImageNet for ℓ∞-perturbations with radius ε=1/255.

Training certifiably robust neural networks remains a notoriously hard problem. On one side, adversarial training optimizes under-approximations of the worst-case loss, which leads to insufficient regularization for certification, while on the other, sound certified training methods optimize loose over-approximations, leading to over-regularization and poor (standard) accuracy. In this work we propose TAPS, an (unsound) certified training method that combines IBP and PGD training to yield precise, although not necessarily sound, worst-case loss approximations, reducing over-regularization and increasing certified and standard accuracies. Empirically, TAPS achieves a new state-of-the-art in many settings, e.g., reaching a certified accuracy of $22\%$ on TinyImageNet for $\ell_\infty$-perturbations with radius $ε=1/255$. We make our implementation and networks public at https://github.com/eth-sri/taps.
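
The abstract does not spell out how IBP and PGD are combined, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes the network is split into a feature extractor and a classifier head: interval bounds (IBP) are propagated through the extractor to obtain a box in latent space, and a PGD-style search inside that box estimates the worst-case margin of the head. The toy network, its random weights, and all function names (`ibp_affine`, `pgd_margin_in_box`, and so on) are hypothetical and chosen for illustration.

```python
import numpy as np

def ibp_affine(lo, hi, W, b):
    """Sound interval bounds for x -> W @ x + b over the box [lo, hi]."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius
    return out_center - out_radius, out_center + out_radius

def ibp_relu(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def margin(z, U, c, V, d, y, t):
    """logit_t - logit_y of the classifier head at latent point z."""
    hidden = np.maximum(U @ z + c, 0.0)
    return (V[t] - V[y]) @ hidden + (d[t] - d[y])

def margin_grad(z, U, c, V, y, t):
    """Gradient of the margin with respect to z (manual backprop)."""
    active = (U @ z + c > 0).astype(float)
    return U.T @ ((V[t] - V[y]) * active)

def pgd_margin_in_box(lo, hi, U, c, V, d, y, t, steps=20, rel_step=0.25):
    """Signed-gradient ascent on the margin, projected onto the latent box.

    Returns the largest margin seen, which is an estimate and NOT a sound bound.
    """
    z = (lo + hi) / 2.0
    best = margin(z, U, c, V, d, y, t)
    for _ in range(steps):
        step = rel_step * (hi - lo) / 2.0
        z = np.clip(z + step * np.sign(margin_grad(z, U, c, V, y, t)), lo, hi)
        best = max(best, margin(z, U, c, V, d, y, t))
    return best

def ibp_margin_in_box(lo, hi, U, c, V, d, y, t):
    """Sound IBP upper bound on the same margin over the same latent box."""
    h_lo, h_hi = ibp_relu(*ibp_affine(lo, hi, U, c))
    w = (V[t] - V[y])[None, :]
    _, upper = ibp_affine(h_lo, h_hi, w, np.array([d[t] - d[y]]))
    return upper[0]

# Hypothetical toy network with random weights (illustration only).
rng = np.random.default_rng(0)
n_in, n_latent, n_hidden, n_classes = 6, 5, 8, 3
W1, b1 = rng.normal(size=(n_latent, n_in)), rng.normal(size=n_latent)
U, c = rng.normal(size=(n_hidden, n_latent)), rng.normal(size=n_hidden)
V, d = rng.normal(size=(n_classes, n_hidden)), rng.normal(size=n_classes)

x, y, eps = rng.uniform(size=n_in), 0, 1.0 / 255.0

# Step 1: IBP through the extractor gives a box in latent space.
z_lo, z_hi = ibp_relu(*ibp_affine(x - eps, x + eps, W1, b1))

# Step 2: PGD inside that box on the classifier head, one run per wrong class.
targets = [t for t in range(n_classes) if t != y]
taps_like = max(pgd_margin_in_box(z_lo, z_hi, U, c, V, d, y, t) for t in targets)
ibp_bound = max(ibp_margin_in_box(z_lo, z_hi, U, c, V, d, y, t) for t in targets)
print(f"PGD-in-latent-box margin: {taps_like:.4f}  IBP margin bound: {ibp_bound:.4f}")
```

In this sketch the PGD estimate can never exceed the IBP bound over the same latent box, which is the sense in which it is tighter but no longer guaranteed to be sound. A negative IBP bound would certify the sample, whereas a negative PGD estimate would not. The sketch only evaluates the two quantities; how gradients flow between the IBP and PGD parts during training is not covered here.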
