CV
Mar 27, 2020

Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement

arXiv:2003.12649v1
37 citations
AI Analysis

This paper addresses synthetic-to-real image translation for computer vision. Rather than translating in image space, which is under-constrained, it works on the disentangled shading and albedo layers of the image.

The paper aims to improve the visual realism of low-quality synthetic images, such as OpenGL renderings. It proposes a semi-supervised method that operates on disentangled shading and albedo layers. On the SUNCG indoor scene dataset, the resulting images are more realistic than those of other state-of-the-art approaches, and networks trained on them predict depth and normals more accurately than domain adaptation approaches.

We present a method to improve the visual realism of low-quality, synthetic images, e.g. OpenGL renderings. Training an unpaired synthetic-to-real translation network in image space is severely under-constrained and produces visible artifacts. Instead, we propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image. Our two-stage pipeline first learns to predict accurate shading in a supervised fashion using physically-based renderings as targets, and further increases the realism of the textures and shading with an improved CycleGAN network. Extensive evaluations on the SUNCG indoor scene dataset demonstrate that our approach yields more realistic images compared to other state-of-the-art approaches. Furthermore, networks trained on our generated "real" images predict more accurate depth and normals than domain adaptation approaches, suggesting that improving the visual realism of the images can be more effective than imposing task-specific losses.
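The idea of working on disentangled layers can be illustrated with a minimal sketch. It assumes the common multiplicative intrinsic-image model (image = albedo × shading); the abstract does not state the exact formulation, so this is an assumption. The function names, the `EPS` guard, and the random toy data are hypothetical and not taken from the paper.

```python
import numpy as np

EPS = 1e-6  # hypothetical guard against division by zero in dark albedo regions


def compose(albedo, shading):
    """Recombine layers under the assumed model: image = albedo * shading."""
    return albedo * shading


def shading_from(image, albedo):
    """Recover shading by dividing out albedo (valid only under the same model)."""
    return image / np.maximum(albedo, EPS)


rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 1.0, size=(4, 4, 3))   # toy reflectance (texture) layer
shading = rng.uniform(0.0, 1.5, size=(4, 4, 3))  # toy illumination layer

image = compose(albedo, shading)          # synthetic "rendering"
recovered = shading_from(image, albedo)   # shading layer recovered from the image

# Check that the decomposition round-trips on this toy data.
print(np.allclose(recovered, shading))
```

In a pipeline like the one described, a network would adjust the shading and albedo layers separately and the result would be recombined with something like `compose`. The sketch shows only the layer arithmetic, not the learned stages.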

