PureVQ-GAN: Defending Data Poisoning Attacks through Vector-Quantized Bottlenecks
This work addresses data poisoning, a machine learning security problem, and offers a defense that is over 50x faster than diffusion-based methods.
The paper defends against data poisoning attacks by introducing PureVQ-GAN, which uses a vector-quantized bottleneck to destroy backdoor triggers while preserving image semantics. On CIFAR-10 it achieves a 0% poison success rate against Gradient Matching and Bullseye Polytope attacks and 91-95% clean accuracy.
We introduce PureVQ-GAN, a defense against data poisoning that forces backdoor triggers through a discrete bottleneck, using a Vector-Quantized VAE (VQ-VAE) with a GAN discriminator. By quantizing poisoned images through a learned codebook, PureVQ-GAN destroys fine-grained trigger patterns while preserving semantic content. The GAN discriminator ensures that outputs match the natural image distribution, preventing reconstruction of out-of-distribution perturbations. On CIFAR-10, PureVQ-GAN achieves a 0% poison success rate (PSR) against Gradient Matching and Bullseye Polytope attacks and 1.64% against Narcissus, while maintaining 91-95% clean accuracy. PureVQ-GAN is also over 50x faster than diffusion-based defenses, which require hundreds of iterative refinement steps, making it practical for real training pipelines.
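The intuition behind the discrete bottleneck can be illustrated with a minimal sketch of nearest-neighbour codebook quantization, the standard lookup step in a VQ-VAE. This is not the authors' implementation: the toy two-dimensional codebook, the latent values, and the function name `quantize` are all assumptions made for illustration, and the perturbation is applied directly in latent space rather than to an image passed through a trained encoder.

```python
from math import dist

def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook entry (Euclidean)."""
    indices = [
        min(range(len(codebook)), key=lambda k: dist(z, codebook[k]))
        for z in latents
    ]
    return [codebook[k] for k in indices], indices

# Toy 2-D codebook (hypothetical values; a real codebook is learned).
CODEBOOK = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# Hypothetical latent vectors for a clean input.
clean = [(0.05, 0.98), (0.90, 0.10)]

# Small additive perturbation standing in for a fine-grained trigger.
delta = [(0.03, -0.02), (-0.04, 0.05)]
poisoned = [tuple(a + b for a, b in zip(z, d)) for z, d in zip(clean, delta)]

quantized_clean, idx_clean = quantize(clean, CODEBOOK)
quantized_poisoned, idx_poisoned = quantize(poisoned, CODEBOOK)

# Both inputs snap to the same codebook entries, so the perturbation
# does not survive the bottleneck in this toy case.
print(idx_clean, idx_poisoned)
```

In this toy setting the perturbed latents fall in the same quantization cells as the clean ones, so the quantized outputs are identical. A perturbation large enough to cross a cell boundary would change the selected code, so the sketch shows only why small, fine-grained patterns tend to be discarded, not a guarantee.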