Toward Few-step Adversarial Training from a Frequency Perspective
This work addresses the computational cost of adversarial training, a practical concern for machine learning practitioners, though the contribution is incremental because it builds on existing PGD methods.
The authors tackle the problem of reducing adversarial training overhead by developing Spectral Projected Gradient Descent (SPGD), a frequency-domain extension of PGD that achieves a better success rate in early steps and higher adversarial accuracy at a fixed number of steps. They also prove that SPGD is equivalent to a variant of PGD that omits the sign function.
We investigate adversarial-sample generation methods from a frequency-domain perspective and extend standard $l_{\infty}$ Projected Gradient Descent (PGD) to the frequency domain. The resulting method, which we call Spectral Projected Gradient Descent (SPGD), has a higher success rate than PGD during the early steps of the attack. Adversarially training models with SPGD achieves greater adversarial accuracy than training with PGD when the number of attack steps is held constant. SPGD can, therefore, reduce the overhead of adversarial training by allowing adversarial examples to be generated with fewer steps. However, we also prove that SPGD is equivalent to a variant of the PGD ordinarily used for the $l_{\infty}$ threat model. This PGD variant omits the sign function that is ordinarily applied to the gradient. SPGD can, therefore, be performed without explicitly transforming into the frequency domain. Finally, we visualize the perturbations SPGD generates and find that they use both high- and low-frequency components, which suggests that removing either high-frequency or low-frequency components is not an effective defense.
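The equivalence claim above can be illustrated with a minimal sketch. It assumes, purely for illustration, an orthonormal DCT as the frequency transform, a toy linear model with logistic loss, and hypothetical function names (`dct_matrix`, `spectral_step`, `pgd_nosign_step`). None of these come from the paper, whose exact transform, step size, and projection may differ. Under these assumptions, a gradient-ascent step taken on the frequency coefficients maps back to the same pixel-space update as a PGD step without the sign function.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (assumed transform), so C @ C.T == I."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # rescale the DC row for orthonormality
    return c

def loss_grad(x, w, y):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * w.x)) (toy model)."""
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))

def project_linf(x_adv, x, eps):
    """Project x_adv onto the l_inf ball of radius eps around x."""
    return np.clip(x_adv, x - eps, x + eps)

def pgd_sign_step(x_adv, x, w, y, alpha, eps):
    """Standard l_inf PGD step: ascend along sign(gradient). Shown for contrast."""
    g = loss_grad(x_adv, w, y)
    return project_linf(x_adv + alpha * np.sign(g), x, eps)

def pgd_nosign_step(x_adv, x, w, y, alpha, eps):
    """PGD variant that uses the raw gradient instead of its sign."""
    g = loss_grad(x_adv, w, y)
    return project_linf(x_adv + alpha * g, x, eps)

def spectral_step(x_adv, x, w, y, alpha, eps, C):
    """Gradient step on frequency coefficients z = C @ x_adv (illustrative only)."""
    z = C @ x_adv
    g_z = C @ loss_grad(x_adv, w, y)   # chain rule: x = C.T @ z, so grad_z = C @ grad_x
    z_new = z + alpha * g_z
    return project_linf(C.T @ z_new, x, eps)  # back to pixel space, then project

# Toy data (hypothetical): a 16-dimensional "image" and a random linear model.
rng = np.random.default_rng(0)
n = 16
x = rng.uniform(0.0, 1.0, n)
w = rng.normal(size=n)
y = 1.0
C = dct_matrix(n)

a = pgd_nosign_step(x, x, w, y, alpha=0.5, eps=0.1)
b = spectral_step(x, x, w, y, alpha=0.5, eps=0.1, C=C)
print(np.allclose(a, b))  # compares the two updates numerically
```

The agreement follows from orthonormality: since `C.T @ C` is the identity, `C.T @ (z + alpha * C @ g)` reduces to `x_adv + alpha * g`, so the transform never needs to be computed explicitly. `pgd_sign_step` is included only to show where the ordinary $l_{\infty}$ update differs.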