CV
Mar 17, 2016

Generative Image Modeling using Style and Structure Adversarial Networks

arXiv:1603.05631v2
635 citations
Originality: Incremental advance
AI Analysis

This paper addresses image generation for computer vision by introducing a novel factorization approach, though it is an incremental advance because it builds on existing GAN frameworks.

The paper tackles realistic image generation by factorizing the process into structure and style components. The proposed S^2-GAN produces more realistic and interpretable images and enables unsupervised RGBD representation learning.

Current generative frameworks use end-to-end learning and generate images by sampling from a uniform noise distribution. However, these approaches ignore the most basic principle of image formation: images are the product of (a) Structure: the underlying 3D model; and (b) Style: the texture mapped onto the structure. In this paper, we factorize the image generation process and propose the Style and Structure Generative Adversarial Network (S^2-GAN). Our S^2-GAN has two components: the Structure-GAN generates a surface normal map; the Style-GAN takes the surface normal map as input and generates the 2D image. Apart from a real vs. generated loss function, we use an additional loss with computed surface normals from generated images. The two GANs are first trained independently, and then merged together via joint learning. We show our S^2-GAN model is interpretable, generates more realistic images and can be used to learn unsupervised RGBD representations.
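The two-stage data flow described in the abstract can be sketched roughly as below. This is a toy illustration, not the authors' implementation: the function names (`structure_generator`, `style_generator`, `normal_consistency_loss`, `generate`) are hypothetical, both generators are hand-written stand-ins for learned networks, and the cosine-based loss is an assumed placeholder for the paper's surface-normal loss, whose exact form the abstract does not state.

```python
import math
import random


def sample_noise(dim, rng):
    # Uniform noise in [-1, 1], matching the abstract's "uniform noise distribution".
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def normalize(v):
    # Scale a 3-vector to unit length (falls back to 1.0 to avoid division by zero).
    norm = math.sqrt(sum(c * c for c in v)) or 1.0
    return [c / norm for c in v]


def structure_generator(z, height, width):
    # Toy stand-in for the Structure-GAN generator: noise -> surface normal map.
    # Every pixel holds a unit 3-vector; a real model would be a learned network.
    return [
        [normalize([z[0] + x / width - 0.5, z[1] + y / height - 0.5, 1.0])
         for x in range(width)]
        for y in range(height)
    ]


def style_generator(normals, z_style):
    # Toy stand-in for the Style-GAN generator: (normal map, noise) -> RGB image.
    # Assumption: simple diffuse shading with a light direction derived from the noise.
    light = normalize([z_style[0], z_style[1], 1.0])
    image = []
    for row in normals:
        shaded = [max(0.0, sum(a * b for a, b in zip(n, light))) for n in row]
        image.append([[s, s, s] for s in shaded])
    return image


def normal_consistency_loss(predicted, target):
    # Assumed placeholder loss: mean (1 - cosine similarity) between two maps
    # of unit normals. In the paper, `predicted` would be normals computed
    # from the generated image.
    total, count = 0.0, 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            total += 1.0 - sum(a * b for a, b in zip(p, t))
            count += 1
    return total / count


def generate(rng, height=4, width=4, noise_dim=2):
    # Stage 1: structure noise -> normal map. Stage 2: normal map + style noise -> image.
    normals = structure_generator(sample_noise(noise_dim, rng), height, width)
    image = style_generator(normals, sample_noise(noise_dim, rng))
    return normals, image
```

Calling `generate(random.Random(0))` returns a 4x4 normal map and the image rendered from it. Because the normal map is an explicit intermediate output, it can be inspected or edited before the style stage runs, which is the sense in which the factorized model is described as interpretable.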

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
