A Note on Implementation Errors in Recent Adaptive Attacks Against Multi-Resolution Self-Ensembles
This work addresses a validation issue in adversarial machine learning research, highlighting the need for careful implementation to ensure accurate robustness evaluations.
The paper identifies an implementation error in recent adaptive attacks against a multi-resolution self-ensemble defense, where perturbations exceeded intended bounds by up to 20×, and shows that when properly constrained, the defense maintains non-trivial robustness.
This note documents an implementation issue in recent adaptive attacks (Zhang et al. [2024]) against the multi-resolution self-ensemble defense (Fort and Lakshminarayanan [2024]). The implementation allowed adversarial perturbations to exceed the standard $L_\infty = 8/255$ bound by a factor of up to 20, reaching magnitudes of up to $L_\infty = 160/255$. When the attacks are properly constrained to the intended bound, the defense maintains non-trivial robustness. Beyond highlighting the importance of careful validation in adversarial machine learning research, our analysis reveals an intriguing finding: properly bounded adaptive attacks against strong multi-resolution self-ensembles often align with human perception, suggesting the need to reconsider how we measure adversarial robustness.
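To make the bound concrete, the following is a minimal sketch of the kind of check that catches an out-of-budget perturbation. It is not the code of either cited work. It assumes images are flattened sequences of pixel values in $[0, 1]$, and the names `linf_distance`, `project_linf`, and `within_budget` are hypothetical helpers introduced only for illustration.

```python
EPS = 8 / 255  # the standard L_inf budget stated in the abstract


def linf_distance(x_adv, x_clean):
    """Largest absolute per-pixel difference between two flattened images."""
    return max(abs(a - c) for a, c in zip(x_adv, x_clean))


def project_linf(x_adv, x_clean, eps=EPS):
    """Clamp x_adv into the eps-ball around x_clean, then into the pixel range [0, 1]."""
    projected = []
    for a, c in zip(x_adv, x_clean):
        delta = min(max(a - c, -eps), eps)  # limit the per-pixel perturbation
        projected.append(min(max(c + delta, 0.0), 1.0))  # keep a valid pixel value
    return projected


def within_budget(x_adv, x_clean, eps=EPS, tol=1e-9):
    """True if the perturbation respects the budget, up to floating-point tolerance."""
    return linf_distance(x_adv, x_clean) <= eps + tol


# Toy example (assumed values): the first pixel is perturbed by 160/255,
# i.e. 20 times the intended budget.
x_clean = [0.20, 0.50, 0.90]
x_adv = [0.20 + 160 / 255, 0.50 - 0.01, 0.90 + 0.02]

ratio = linf_distance(x_adv, x_clean) / EPS  # how far over budget the attack is
x_fixed = project_linf(x_adv, x_clean)       # the same attack, properly constrained
```

Running `within_budget` on the final adversarial example before reporting robust accuracy is a cheap assertion. In this toy case it fails for `x_adv` and passes for `x_fixed`.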