CV · LG · Aug 18, 2022

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

arXiv:2208.08664v2 · 50 citations · h-index: 98
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in classifier-guided diffusion image synthesis, namely unreliable classifier gradients, and offers an incremental improvement over existing guidance methods.

The paper tackles the problem of unreliable gradients from traditional classifiers in class-conditional diffusion models, which can hinder image generation. It replaces the standard guidance classifier with a time-dependent adversarially robust classifier. In experiments on ImageNet, this yields improved generation metrics, and human raters in an opinion survey prefer its results.

Denoising diffusion probabilistic models (DDPMs) are a recent family of generative models that achieve state-of-the-art results. In order to obtain class-conditional generation, it was suggested to guide the diffusion process by gradients from a time-dependent classifier. While the idea is theoretically sound, deep learning-based classifiers are infamously susceptible to gradient-based adversarial attacks. Therefore, while traditional classifiers may achieve good accuracy scores, their gradients are possibly unreliable and might hinder the improvement of the generation results. Recent work discovered that adversarially robust classifiers exhibit gradients that are aligned with human perception, and these could better guide a generative process towards semantically meaningful images. We utilize this observation by defining and training a time-dependent adversarially robust classifier and use it as guidance for a generative diffusion model. In experiments on the highly challenging and diverse ImageNet dataset, our scheme introduces significantly more intelligible intermediate gradients, better alignment with theoretical findings, as well as improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method's results.
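To make the guidance idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the commonly used form of classifier guidance, in which the denoising mean is shifted by the scaled gradient of log p(y | x_t) with respect to x_t. The classifier is a toy linear softmax model standing in for a single timestep, so the time dependence is omitted. All function names and parameters (`class_log_prob_grad`, `guided_mean`, `l2_adversarial_example`, `scale`, `eps`) are hypothetical. The last function illustrates the kind of small gradient-based perturbation that adversarial training would include in the training data; the actual attack and threat model used in the paper are not specified here.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D logit vector."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def class_log_prob_grad(W, b, x, y):
    """Gradient of log p(y | x) w.r.t. x for a toy linear softmax classifier.

    With logits = W @ x + b, the gradient is W[y] - sum_k p_k * W[k].
    """
    probs = np.exp(log_softmax(W @ x + b))
    return W[y] - probs @ W

def guided_mean(mu, sigma2, grad, scale=1.0):
    """Shift the denoising mean along the classifier gradient (assumed form)."""
    return mu + scale * sigma2 * grad

def l2_adversarial_example(W, b, x, y, eps=0.5):
    """One normalized gradient step of L2 length eps that lowers log p(y | x)."""
    g = class_log_prob_grad(W, b, x, y)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return x.copy()
    return x - eps * g / norm

# Toy usage: three classes, two-dimensional "image".
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x_t = np.array([0.2, -0.1])   # stands in for the current denoising mean
target = 0

grad = class_log_prob_grad(W, b, x_t, target)
x_guided = guided_mean(x_t, sigma2=0.1, grad=grad, scale=1.0)
x_adv = l2_adversarial_example(W, b, x_t, target, eps=0.5)

print("log p(y|x) before:", log_softmax(W @ x_t + b)[target])
print("log p(y|x) guided:", log_softmax(W @ x_guided + b)[target])
print("log p(y|x) attacked:", log_softmax(W @ x_adv + b)[target])
```

In this sketch the same gradient drives both steps: guidance follows it to raise the target-class probability, while the attack moves against it. This is why the quality of a classifier's input gradients matters for guidance, which is the observation the abstract builds on.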

Code Implementations: 1 repo
