CV · AI · LG · Jun 1, 2023

We never go out of Style: Motion Disentanglement by Subspace Decomposition of Latent Space

arXiv:2306.00559v1 · 1 citation · h-index: 57
Originality: Incremental advance
AI Analysis

This work addresses motion disentanglement for video analysis and editing. The method is data-efficient, needing only about 10 ground-truth video sequences, but it appears incremental because it builds on existing pretrained GAN frameworks.

The paper tackles the problem of decomposing complex motions in videos into independent components, such as facial expressions and head pose. It does so by discovering disentangled motion subspaces in the latent space of pretrained style-based GAN models, using only about 10 ground-truth video sequences. It demonstrates applications in motion editing and selective motion transfer.

Real-world objects perform complex motions that involve multiple independent motion components. For example, while talking, a person continuously changes their expressions, head, and body pose. In this work, we propose a novel method to decompose motion in videos by using a pretrained image GAN model. We discover disentangled motion subspaces in the latent space of widely used style-based GAN models that are semantically meaningful and control a single explainable motion component. The proposed method uses only a few $(\approx10)$ ground truth video sequences to obtain such subspaces. We extensively evaluate the disentanglement properties of motion subspaces on face and car datasets, quantitatively and qualitatively. Further, we present results for multiple downstream tasks such as motion editing, and selective motion transfer, e.g. transferring only facial expressions without training for it.
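
To make the idea of a "motion subspace" concrete, here is a minimal sketch of a linear subspace in a latent space and of selective transfer through it. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify how the subspaces are found, so plain PCA (via SVD) stands in for the decomposition step, the latent codes are ordinary NumPy arrays rather than outputs of a real GAN inversion, and the function names `fit_motion_subspace`, `split_latent`, and `transfer_motion` are hypothetical.

```python
import numpy as np


def fit_motion_subspace(latents, k):
    """Fit a k-dimensional linear subspace to per-frame latent codes.

    ASSUMPTION: PCA is used as a stand-in; the paper's actual
    decomposition is not described in the abstract.
    latents: array of shape (num_frames, d). Returns (mean, basis), where
    basis has shape (k, d) and orthonormal rows.
    """
    mean = latents.mean(axis=0)
    centered = latents - mean
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def split_latent(w, mean, basis):
    """Split a latent code into its in-subspace part and the residual."""
    coeffs = (w - mean) @ basis.T      # coordinates inside the subspace
    inside = coeffs @ basis            # projection back into latent space
    residual = (w - mean) - inside     # everything the subspace ignores
    return inside, residual


def transfer_motion(w_source, w_target, mean, basis):
    """Copy only the subspace component from source to target.

    The target keeps its residual (e.g. identity and other motions in this
    toy setting) and receives the source's in-subspace component.
    """
    src_inside, _ = split_latent(w_source, mean, basis)
    tgt_inside, _ = split_latent(w_target, mean, basis)
    return w_target - tgt_inside + src_inside
```

In this toy setting, editing a motion amounts to changing the coefficients inside the subspace, and selective transfer amounts to swapping them between two latent codes while leaving the residual untouched. Whether a real GAN latent space behaves this linearly is exactly what the paper's evaluation would have to establish.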

