Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks
This work addresses security vulnerabilities in generative models and is relevant to both researchers and practitioners, although its contribution is incremental because it builds on existing adversarial attack frameworks.
The paper investigates how adversarial attacks can manipulate the latent codes of Variational Autoencoders (VAEs), demonstrates methods for both supervised and unsupervised attacks, and uses proposed metrics to examine the robustness of modified VAEs such as β-VAE and NVAE.
In this work, we explore adversarial attacks on Variational Autoencoders (VAEs). We show how to modify a data point to obtain a prescribed latent code (supervised attack) or simply to obtain a drastically different code (unsupervised attack). We examine the influence of model modifications ($\beta$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.
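To make the two attack types concrete, the following is a minimal sketch under strong simplifying assumptions, not the method used in this work. The encoder is replaced by a hypothetical toy linear map `z = W x`, the discrepancy between latent codes is taken to be the squared Euclidean distance, and the perturbation is kept inside an L-infinity ball of radius `eps_max`. The names `encode`, `sq_dist`, `attack`, and all numeric values are illustrative assumptions.

```python
import random


def encode(W, x):
    """Toy stand-in for the mean of a VAE encoder: z = W x (assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two latent codes."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def attack(W, x, z_target=None, eps_max=0.1, steps=200, lr=0.05, seed=0):
    """Projected gradient attack on the latent code of the toy encoder.

    Supervised (z_target given): minimise ||W(x + eps) - z_target||^2.
    Unsupervised (z_target is None): maximise ||W(x + eps) - W x||^2.
    The perturbation eps is clipped to [-eps_max, eps_max] per coordinate.
    """
    rng = random.Random(seed)
    # Small random start: at eps = 0 the unsupervised gradient is exactly zero.
    eps = [rng.uniform(-1e-3, 1e-3) for _ in x]
    reference = z_target if z_target is not None else encode(W, x)
    sign = -1.0 if z_target is not None else 1.0  # descend vs. ascend
    for _ in range(steps):
        z = encode(W, [xi + ei for xi, ei in zip(x, eps)])
        residual = [zi - ri for zi, ri in zip(z, reference)]
        # Gradient of the squared distance w.r.t. eps is 2 W^T residual.
        grad = [
            2.0 * sum(W[k][j] * residual[k] for k in range(len(W)))
            for j in range(len(x))
        ]
        # Gradient step followed by projection onto the L-infinity ball.
        eps = [
            min(eps_max, max(-eps_max, ej + sign * lr * gj))
            for ej, gj in zip(eps, grad)
        ]
    return [xi + ei for xi, ei in zip(x, eps)]


# Illustrative values only.
W = [[1.0, 0.5, -0.5], [0.0, 1.0, 1.0]]
x = [0.2, -0.1, 0.4]
z_target = [0.5, -0.5]

x_sup = attack(W, x, z_target=z_target)  # supervised: move towards z_target
x_unsup = attack(W, x)                   # unsupervised: move away from encode(W, x)

print("distance to target before:", sq_dist(encode(W, x), z_target))
print("distance to target after: ", sq_dist(encode(W, x_sup), z_target))
print("latent shift (unsupervised):", sq_dist(encode(W, x_unsup), encode(W, x)))
```

With a real VAE the encoder is nonlinear and outputs a distribution, so the gradient would typically come from automatic differentiation and the discrepancy would compare posteriors rather than point codes; the projection step that bounds the perturbation carries over unchanged.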