CV · Apr 21, 2019

TransGaGa: Geometry-Aware Unsupervised Image-to-Image Translation

arXiv:1904.09571v1 · 114 citations
Originality: Highly original
AI Analysis

This paper addresses image-to-image translation between objects with complex geometry, a computer vision problem, and represents an incremental improvement over existing methods.

The paper tackles unsupervised image-to-image translation across large geometry variations, a setting in which existing methods often fail. It proposes a disentangle-and-translate framework that separates the appearance and geometry latent spaces, and reports better performance than state-of-the-art methods on near-rigid and non-rigid translation tasks.

Unsupervised image-to-image translation aims at learning a mapping between two visual domains. However, learning a translation across large geometry variations always ends in failure. In this work, we present a novel disentangle-and-translate framework to tackle the image-to-image translation task for complex objects. Instead of learning the mapping on the image space directly, we disentangle the image space into a Cartesian product of the appearance and the geometry latent spaces. Specifically, we first introduce a geometry prior loss and a conditional VAE loss to encourage the network to learn independent but complementary representations. The translation is then built on the appearance and geometry spaces separately. Extensive experiments demonstrate the superior performance of our method over other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid object translation tasks. In addition, by taking different exemplars as the appearance references, our method also supports multimodal translation. Project page: https://wywu.github.io/projects/TGaGa/TGaGa.html
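
The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: random linear maps stand in for trained networks, the losses (geometry prior, conditional VAE) are omitted, and every name and dimension below (IMG_DIM, make_linear, translate_x_to_y, and so on) is a hypothetical chosen for illustration. It only shows an image being split into a geometry code and an appearance code, each code being translated by its own mapping, and the two results being decoded together in the target domain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper does not specify these values here.
IMG_DIM, GEO_DIM, APP_DIM = 64, 8, 16


def make_linear(in_dim, out_dim):
    """Return a random linear map standing in for a learned network."""
    weights = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: x @ weights


# Source-domain encoders: one per latent space (assumed structure).
encode_geometry_x = make_linear(IMG_DIM, GEO_DIM)
encode_appearance_x = make_linear(IMG_DIM, APP_DIM)

# Separate translators, so geometry and appearance are mapped independently.
translate_geometry = make_linear(GEO_DIM, GEO_DIM)
translate_appearance = make_linear(APP_DIM, APP_DIM)

# Target-domain decoder that consumes both translated codes.
decode_y = make_linear(GEO_DIM + APP_DIM, IMG_DIM)


def translate_x_to_y(image_x):
    """Disentangle, translate each latent code separately, then decode."""
    geometry = encode_geometry_x(image_x)
    appearance = encode_appearance_x(image_x)
    geometry_y = translate_geometry(geometry)
    appearance_y = translate_appearance(appearance)
    latent_y = np.concatenate([geometry_y, appearance_y], axis=-1)
    return decode_y(latent_y)


# A batch of 4 flattened toy "images" from the source domain.
batch = rng.standard_normal((4, IMG_DIM))
translated = translate_x_to_y(batch)
```

In this sketch, swapping the appearance code for one encoded from a different exemplar image would be the natural place to model the multimodal translation the abstract mentions.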

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
