LGMar 18, 2022

Alleviating Adversarial Attacks on Variational Autoencoders with MCMC

Anna Kuzina, Max Welling, Jakub M. Tomczak

arXiv:2203.09940v211.815 citationsh-index: 94Has Code

Originality Incremental advance

AI Analysis

This addresses a security issue for users of VAEs in applications like classification, though it is incremental as it builds on prior attack methods.

The paper tackles the vulnerability of variational autoencoders (VAEs) to adversarial attacks that manipulate latent representations and reconstructions, and presents a solution using MCMC in the inference step that improves robustness across multiple datasets and VAE configurations without degrading performance on non-attacked inputs.

Variational autoencoders (VAEs) are latent variable models that can generate complex objects and provide meaningful latent representations. Moreover, they could be further used in downstream tasks such as classification. As previous work has shown, one can easily fool VAEs to produce unexpected latent representations and reconstructions for a visually slightly modified input. Here, we examine several objective functions for adversarial attack construction proposed previously and present a solution to alleviate the effect of these attacks. Our method utilizes the Markov Chain Monte Carlo (MCMC) technique in the inference step that we motivate with a theoretical analysis. Thus, we do not incorporate any extra costs during training, and the performance on non-attacked inputs is not decreased. We validate our approach on a variety of datasets (MNIST, Fashion MNIST, Color MNIST, CelebA) and VAE configurations ($β$-VAE, NVAE, $β$-TCVAE), and show that our approach consistently improves the model robustness to adversarial attacks.

View on arXiv PDF Code

Similar