CV | May 15, 2021

Mask-Guided Discovery of Semantic Manifolds in Generative Models

arXiv:2105.07273v1 | 4 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable, controllable image generation, with applications such as animation. The contribution is incremental because it builds on existing GAN architectures.

The paper tackles entangled latent spaces in GANs such as StyleGAN2, which offer no meaningful control over image attributes such as facial expressions. It presents a method that discovers semantic manifolds for localized face regions, enabling smooth animations without labeled data or changes to model parameters.

Advances in the realm of Generative Adversarial Networks (GANs) have led to architectures capable of producing amazingly realistic images, such as StyleGAN2, which, when trained on the FFHQ dataset, generates images of human faces from random vectors in a lower-dimensional latent space. Unfortunately, this space is entangled: translating a latent vector along its axes does not correspond to a meaningful transformation in the output space (e.g., smiling mouth, squinting eyes). The model behaves as a black box, providing neither control over its output nor insight into the structures it has learned from the data. We present a method to explore the manifolds of changes of spatially localized regions of the face. Our method discovers smoothly varying sequences of latent vectors along these manifolds suitable for creating animations. Unlike existing disentanglement methods that either require labelled data or explicitly alter internal model parameters, our method is an optimization-based approach guided by a custom loss function and a manually defined region of change. Our code is open-source and is available, along with supplementary results, on our project page: https://github.com/bmolab/masked-gan-manifold
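
To make the idea of an optimization guided by a loss function and a manually defined region more concrete, here is a minimal sketch. It is not the authors' implementation: the linear toy generator, the function names (make_toy_generator, masked_loss, optimize_latent), the loss terms, and all hyperparameters are assumptions chosen for illustration. The illustrative loss rewards a fixed amount of change inside the mask and penalizes change outside it. It finds a single latent vector rather than the smoothly varying sequences the abstract describes. As in the abstract, only the latent vector is optimized and the generator weights are never modified.

```python
import numpy as np


def make_toy_generator(latent_dim=32, n_pixels=16, seed=0):
    """Hypothetical stand-in for a pretrained generator: a fixed linear map."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n_pixels, latent_dim)) / np.sqrt(latent_dim)


def masked_loss(A, z, z_ref, mask, target=1.0, lam=10.0):
    """Illustrative loss (an assumption, not the paper's formulation).

    Pushes the squared change inside the mask toward target**2 and
    penalizes any change outside the mask. Returns (loss, gradient wrt z).
    """
    delta = A @ (z - z_ref)            # pixel change relative to the reference image
    inside = mask * delta              # change within the region of interest
    outside = (1.0 - mask) * delta     # change everywhere else
    gap = inside @ inside - target ** 2
    loss = gap ** 2 + lam * (outside @ outside)
    grad = A.T @ (4.0 * gap * inside + 2.0 * lam * outside)
    return loss, grad


def optimize_latent(A, z_ref, mask, steps=500, lr=0.05, seed=1):
    """Gradient descent on the latent vector only; A stays frozen."""
    rng = np.random.default_rng(seed)
    z = z_ref + 0.01 * rng.normal(size=z_ref.shape)  # small nudge off the reference
    loss, grad = masked_loss(A, z, z_ref, mask)
    for _ in range(steps):
        candidate = z - lr * grad
        new_loss, new_grad = masked_loss(A, candidate, z_ref, mask)
        if new_loss < loss:
            z, loss, grad = candidate, new_loss, new_grad
        else:
            lr *= 0.5                  # step was too large: shrink it and retry
    return z, loss


A = make_toy_generator()
z_ref = np.random.default_rng(42).normal(size=A.shape[1])

# Manually defined region of change on a 4x4 "image" (a made-up mouth area).
mask2d = np.zeros((4, 4))
mask2d[2:4, 1:3] = 1.0
mask = mask2d.ravel()

z_new, final_loss = optimize_latent(A, z_ref, mask)
delta = A @ (z_new - z_ref)
print("change inside mask: ", np.linalg.norm(mask * delta))
print("change outside mask:", np.linalg.norm((1.0 - mask) * delta))
```

With a real model, the linear map would be replaced by a forward pass through the pretrained network and the hand-written gradient by automatic differentiation. The structure stays the same: a frozen generator, a binary mask, and a loss evaluated on masked pixel differences.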

Code Implementations: 1 repo
