CV | Nov 27, 2023

Exploring Attribute Variations in Style-based GANs using Diffusion Models

arXiv:2311.16052v1 | 1 citation | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses a limitation of existing attribute editing methods, which produce only a single edit per attribute, by offering more varied and realistic image modifications.

The paper tackles diverse attribute editing in images by modeling attributes as multidimensional rather than binary, which enables multiple plausible edits per attribute. It demonstrates the approach across a range of datasets and on 3D face editing.

Existing attribute editing methods treat semantic attributes as binary, resulting in a single edit per attribute. However, attributes such as eyeglasses, smiles, or hairstyles exhibit a vast range of diversity. In this work, we formulate the task of diverse attribute editing by modeling the multidimensional nature of attribute edits. This enables users to generate multiple plausible edits per attribute. We capitalize on disentangled latent spaces of pretrained GANs and train a Denoising Diffusion Probabilistic Model (DDPM) to learn the latent distribution for diverse edits. Specifically, we train DDPM over a dataset of edit latent directions obtained by embedding image pairs with a single attribute change. This leads to latent subspaces that enable diverse attribute editing. Applying diffusion in the highly compressed latent space allows us to model rich distributions of edits within limited computational resources. Through extensive qualitative and quantitative experiments conducted across a range of datasets, we demonstrate the effectiveness of our approach for diverse attribute editing. We also showcase the results of our method applied for 3D editing of various face attributes.
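
As a rough illustration of the data flow the abstract describes, the sketch below builds a dataset of edit directions from paired latents and applies the standard DDPM forward (noising) step to one direction. It is not the authors' implementation. All function names are hypothetical, the toy 3-dimensional latents stand in for real GAN latent codes, and the linear beta schedule is the common DDPM default rather than a setting confirmed by the source. The denoising network that would be trained to predict the noise is omitted.

```python
import math
import random


def edit_directions(source_latents, edited_latents):
    """One direction per image pair: latent(edited) - latent(source)."""
    return [[e - s for s, e in zip(src, edt)]
            for src, edt in zip(source_latents, edited_latents)]


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed here; the source does not state one)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def noise_direction(d0, t, alpha_bars, rng):
    """Sample d_t ~ q(d_t | d_0) and return (d_t, eps).

    eps is the noise a DDPM denoiser would be trained to predict.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in d0]
    d_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(d0, eps)]
    return d_t, eps


def apply_edit(latent, direction, strength=1.0):
    """Move a latent code along an edit direction."""
    return [w + strength * d for w, d in zip(latent, direction)]


# Toy data: latents of two image pairs that differ in a single attribute.
rng = random.Random(0)
source = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
edited = [[0.5, 1.0, 1.0], [1.0, 2.0, 1.5]]

directions = edit_directions(source, edited)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
d_t, eps = noise_direction(directions[0], 500, alpha_bars, rng)
```

In a full pipeline, a network would be trained to recover `eps` from `d_t` and `t`, and new directions sampled from the trained model would be passed to something like `apply_edit` to obtain several plausible edits for the same attribute.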

