CV · CR · LG · Dec 19, 2019

Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)

arXiv:1912.12170v4
Originality: Incremental advance
AI Analysis

This paper addresses the security issue of adversarial attacks on image classification, which is relevant to machine learning practitioners, but it appears incremental because it builds on existing moving-average concepts.

The paper tackles the problem of defending against adversarial attacks with large perturbations by proposing a mitigation scheme that subtracts and adds estimated perturbations using moving-averaged samples, achieving high prediction accuracies even for perturbations greater than 16 on ImageNet with ResNet-50.

We propose a scheme that mitigates the adversarial perturbation $ε$ on an adversarial example $X_{adv}$ ($= X \pm ε$, where $X$ is a benign sample) by subtracting the estimated perturbation $\hat{ε}$ from $X + ε$ and adding $\hat{ε}$ to $X - ε$. The estimated perturbation $\hat{ε}$ is the difference between $X_{adv}$ and its moving-averaged outcome $W_{avg} * X_{adv}$, where $W_{avg}$ is an $N \times N$ moving-average kernel whose coefficients are all one. Adjacent samples of an image are usually close to each other, so we can let $X \approx W_{avg} * X$; we name this relation X-MAS (X minus Moving Averaged Samples). Under this relation, the estimated perturbation $\hat{ε}$ falls within the range of $ε$. The scheme is also extended to multi-level mitigation by treating the mitigated adversarial example $X_{adv} \pm \hat{ε}$ as a new adversarial example to be mitigated. Multi-level mitigation brings $X_{adv}$ closer to $X$, with a smaller (i.e., mitigated) perturbation than the original unmitigated one, by setting the moving-averaged adversarial sample $W_{avg} * X_{adv}$ (which has a smaller perturbation than $X_{adv}$ if $X \approx W_{avg} * X$) as the boundary condition that the multi-level mitigation cannot cross (i.e., a decreasing $ε$ cannot go below it and an increasing $ε$ cannot go beyond it). With multi-level mitigation, we obtain high prediction accuracies even for adversarial examples with a large perturbation (i.e., $ε > 16$). The proposed scheme is evaluated with adversarial examples crafted by FGSM (Fast Gradient Sign Method)-based attacks on ResNet-50 trained on the ImageNet dataset.
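The estimate-and-mitigate loop described in the abstract can be sketched in a few lines of plain Python. This is an illustrative reading of the abstract only, not the authors' implementation. Several details are assumptions made for the sketch: the function names (`moving_average`, `estimate_perturbation`, `mitigate`), the normalization of the all-ones kernel by $N^2$, edge-pixel replication at the image border, the `levels` and `step` parameters, and the use of a signed estimate so that a single subtraction covers both the $X + ε$ and $X - ε$ cases. The boundary condition is modeled by clamping each pixel between its original adversarial value and the moving average of the original adversarial example.

```python
def moving_average(img, n=3):
    """N x N moving average of a 2-D list of floats.

    Assumptions: n is odd, the all-ones kernel is normalized by n*n,
    and border pixels are replicated.
    """
    h, w = len(img), len(img[0])
    r = n // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    total += img[yy][xx]
            out[y][x] = total / (n * n)
    return out


def estimate_perturbation(x_adv, n=3):
    """Signed estimate: eps_hat = X_adv - W_avg * X_adv."""
    avg = moving_average(x_adv, n)
    return [[a - m for a, m in zip(row_a, row_m)]
            for row_a, row_m in zip(x_adv, avg)]


def mitigate(x_adv, n=3, levels=4, step=0.5):
    """Multi-level mitigation sketch (levels and step are hypothetical knobs).

    Each level re-estimates the perturbation on the current image and removes
    a fraction `step` of it. Pixels are clamped so they never cross the moving
    average of the ORIGINAL adversarial example (the boundary condition).
    """
    bound = moving_average(x_adv, n)
    cur = [row[:] for row in x_adv]
    for _ in range(levels):
        eps_hat = estimate_perturbation(cur, n)
        nxt = []
        for y, row in enumerate(cur):
            new_row = []
            for x, v in enumerate(row):
                cand = v - step * eps_hat[y][x]
                lo, hi = sorted((x_adv[y][x], bound[y][x]))
                new_row.append(min(max(cand, lo), hi))
            nxt.append(new_row)
        cur = nxt
    return cur


def mean_abs_error(a, b):
    """Mean absolute per-pixel difference between two 2-D lists."""
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

As a toy check, take a flat benign image and add a sign-pattern perturbation of magnitude 8 (a stand-in for an FGSM-style $\pm ε$ pattern, not a real attack). Calling `mitigate` on it should yield an image whose `mean_abs_error` against the benign image is smaller than that of the unmitigated input. Real images satisfy $X \approx W_{avg} * X$ only approximately, so the benefit there depends on image content and on the choice of $N$.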
