ML CR LG | Jun 1, 2019

Improving VAEs' Robustness to Adversarial Attack

Matthew Willetts, Alexander Camuto, Tom Rainforth, Stephen Roberts, Chris Holmes

arXiv:1906.00230v6 | 10.96 citations

Originality: Incremental advance

AI Analysis

The paper addresses a security problem for users of VAEs in applications such as image generation. Its contribution is incremental, since it builds on existing disentangling and hierarchical methods.

The paper tackles the vulnerability of variational autoencoders (VAEs) to adversarial attacks by introducing methods to enhance their robustness, achieving high-fidelity reconstructions while maintaining adversarial defense across multiple datasets and state-of-the-art attacks.

Variational autoencoders (VAEs) have recently been shown to be vulnerable to adversarial attacks, wherein they are fooled into reconstructing a chosen target image. However, how to defend against such attacks remains an open problem. We make significant advances in addressing this issue by introducing methods for producing adversarially robust VAEs. Namely, we first demonstrate that methods proposed to obtain disentangled latent representations produce VAEs that are more robust to these attacks. However, this robustness comes at the cost of reducing the quality of the reconstructions. We ameliorate this by applying disentangling methods to hierarchical VAEs. The resulting models produce high-fidelity autoencoders that are also adversarially robust. We confirm their capabilities on several different datasets and with current state-of-the-art VAE adversarial attacks, and also show that they increase the robustness of downstream tasks to attack.
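To make the trade-off described in the abstract concrete, here is a minimal sketch of one common disentangling objective, the β-VAE loss. This is an illustrative assumption: the abstract does not say which disentangling methods the authors use, and the function names, the squared-error reconstruction term, and the default `beta` value are all hypothetical choices, not the paper's. The sketch shows how a weight on the KL term regularizes the latent code more strongly at the expense of reconstruction quality.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    beta = 1 gives the standard (negative) ELBO up to constants;
    beta > 1 penalizes the KL term more heavily (illustrative default).
    """
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return recon + beta * gaussian_kl(mu, logvar)

# Toy usage: a perfect reconstruction whose posterior mean is off by 1 in one dimension.
x = np.array([0.2, 0.8, 0.5])
mu = np.array([1.0, 0.0])
logvar = np.zeros(2)
loss = beta_vae_loss(x, x, mu, logvar, beta=4.0)
```

With a larger `beta`, the same deviation of the posterior from the prior costs more, which is the mechanism behind the reduced reconstruction quality the abstract mentions.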
