LG CR Oct 10, 2022

Certified Training: Small Boxes are All You Need

arXiv:2210.04871v2 70 citations h-index: 64
AI Analysis

This work addresses the robustness-accuracy trade-off in adversarial machine learning, offering a novel approach for certified training that could benefit security-critical applications, though it appears incremental as it builds on existing certified defense methods.

The paper tackles the problem of achieving deterministic guarantees of adversarial robustness in neural networks. It proposes SABR, a certified training method that propagates interval bounds for a small subset of the adversarial input region to approximate the worst-case loss. In the paper's evaluation, SABR outperforms existing certified defenses in both standard and certifiable accuracy across datasets and perturbation magnitudes.

To obtain deterministic guarantees of adversarial robustness, specialized training methods are used. We propose SABR, a novel certified training method based on the key insight that propagating interval bounds for a small but carefully selected subset of the adversarial input region is sufficient to approximate the worst-case loss over the whole region while significantly reducing approximation errors. We show in an extensive empirical evaluation that SABR outperforms existing certified defenses in terms of both standard and certifiable accuracies across perturbation magnitudes and datasets, pointing to a new class of certified training methods promising to alleviate the robustness-accuracy trade-off.
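The idea of propagating interval bounds over a small box rather than the whole adversarial region can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy two-layer network, the names `small_box`, `lam`, and `attack_point`, and the rule of centering the box on a given point and shifting it so it stays inside the full region. A real implementation would obtain the center from an adversarial attack and use the bounds inside a training loss.

```python
def affine_bounds(W, b, lo, hi):
    """Propagate the box [lo, hi] through y = W x + b with interval arithmetic."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, a, c in zip(row, lo, hi):
            # A non-negative weight keeps the bound order; a negative one swaps it.
            l += w * a if w >= 0 else w * c
            h += w * c if w >= 0 else w * a
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def output_bounds(layers, lo, hi):
    """Interval bounds on the output of an affine/ReLU network (no ReLU after the last layer)."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


def small_box(x, center, eps, lam):
    """Box of radius lam * eps around `center`, shifted to lie inside the eps-box around x.

    Hypothetical selection rule, assumed for this sketch only.
    """
    tau = lam * eps
    lo, hi = [], []
    for xi, ci in zip(x, center):
        c = min(max(ci, xi - eps + tau), xi + eps - tau)
        lo.append(c - tau)
        hi.append(c + tau)
    return lo, hi


# Toy network and input (all values are illustrative).
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -0.5]),
    ([[1.0, 1.0]], [0.1]),
]
x, eps, lam = [0.5, 0.5], 0.1, 0.25
attack_point = [0.6, 0.4]  # stand-in for a point found by an adversarial attack

full_lo, full_hi = [v - eps for v in x], [v + eps for v in x]
box_lo, box_hi = small_box(x, attack_point, eps, lam)

full_out = output_bounds(layers, full_lo, full_hi)
small_out = output_bounds(layers, box_lo, box_hi)

full_width = full_out[1][0] - full_out[0][0]
small_width = small_out[1][0] - small_out[0][0]
print("full region output width:", full_width)
print("small box output width:  ", small_width)
```

In this toy setting the output interval for the small box is much tighter than the one for the full region, which is the reduction in approximation error that the abstract refers to. The sketch does not show why the small box still approximates the worst-case loss; that depends on how the box is selected.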

Code Implementations: 1 repo