CV, Nov 22, 2022

$S^2$-Flow: Joint Semantic and Style Editing of Facial Images

arXiv:2211.12209v1, 1 citation, h-index: 66
Originality: Incremental advance
AI Analysis

This work addresses a need among computer vision researchers and practitioners for precise image editing tools, though it is incremental in that it builds on existing GAN-based editing methods.

The paper tackles entangled GAN latent spaces, which limit independent and detailed editing of facial images. It proposes a method that disentangles the latent space into semantic and style spaces, enabling controlled edits of each, and evaluates the method quantitatively and qualitatively.

The high-quality images yielded by generative adversarial networks (GANs) have motivated investigations into their application for image editing. However, GANs are often limited in the control they provide for performing specific edits. One of the principal challenges is the entangled latent space of GANs, which is not directly suitable for performing independent and detailed edits. Recent editing methods allow for either controlled style edits or controlled semantic edits. In addition, methods that use semantic masks to edit images have difficulty preserving the identity and are unable to perform controlled style edits. We propose a method to disentangle a GAN's latent space into semantic and style spaces, enabling controlled semantic and style edits for face images independently within the same framework. To achieve this, we design an encoder-decoder based network architecture ($S^2$-Flow), which incorporates two proposed inductive biases. We show the suitability of $S^2$-Flow quantitatively and qualitatively by performing various semantic and style edits.

Code Implementations: 1 repo