CV · Sep 27, 2019

RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis

arXiv:1909.12573v2 · 30 citations
Originality: Incremental advance
AI Analysis

This paper aims to reduce annotation costs for understanding 3D geometry from images. It appears incremental because it builds on existing GAN frameworks.

The paper tackles unsupervised 3D representation learning from 2D images. It proposes RGBD-GAN, a generative model that produces camera-conditioned images and depth maps without any 3D annotations. The key ingredient is a 3D consistency loss, and the authors demonstrate the approach across several generator architectures.

Understanding three-dimensional (3D) geometries from two-dimensional (2D) images without any labeled information is promising for understanding the real world without incurring annotation cost. We herein propose a novel generative model, RGBD-GAN, which achieves unsupervised 3D representation learning from 2D images. The proposed method enables camera parameter-conditional image generation and depth image generation without any 3D annotations, such as camera poses or depth. We use an explicit 3D consistency loss for two RGBD images generated from different camera parameters, in addition to the ordinary GAN objective. The loss is simple yet effective for any type of image generator such as DCGAN and StyleGAN to be conditioned on camera parameters. Through experiments, we demonstrated that the proposed method could learn 3D representations from 2D images with various generator architectures.
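
To make the idea of a 3D consistency loss between two generated RGBD views more concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes a pinhole camera with intrinsics `K`, a relative pose `(R, t)` from view 1 to view 2, nearest-neighbor sampling, and an L1 penalty; the function name `consistency_loss` and all parameter names are hypothetical.

```python
import numpy as np

def consistency_loss(rgb1, depth1, rgb2, depth2, K, R, t):
    """Illustrative L1 reprojection consistency between two RGBD views.

    Assumptions (not from the paper): pinhole intrinsics K (3x3),
    relative pose X2 = R @ X1 + t, nearest-neighbor sampling.
    rgb*: (H, W, 3) arrays, depth*: (H, W) arrays of positive depths.
    """
    h, w = depth1.shape
    # Homogeneous pixel grid, shape (3, H*W), in row-major pixel order.
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)]).astype(np.float64)
    # Back-project view-1 pixels to 3D points in camera-1 coordinates.
    pts1 = (np.linalg.inv(K) @ pix) * depth1.ravel()
    # Express the points in camera-2 coordinates and project them.
    pts2 = R @ pts1 + t.reshape(3, 1)
    z2 = pts2[2]
    proj = K @ pts2
    safe_z = np.where(z2 > 1e-8, z2, 1.0)  # avoid division by zero
    u2 = np.rint(proj[0] / safe_z).astype(int)
    v2 = np.rint(proj[1] / safe_z).astype(int)
    # Keep only points in front of camera 2 that land inside the image.
    valid = (z2 > 1e-8) & (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)
    if not valid.any():
        return 0.0
    rgb2_sampled = rgb2[v2[valid], u2[valid]]
    depth2_sampled = depth2[v2[valid], u2[valid]]
    # Colors should match, and the warped depth should match view-2 depth.
    color_term = np.abs(rgb1.reshape(-1, 3)[valid] - rgb2_sampled).mean()
    depth_term = np.abs(z2[valid] - depth2_sampled).mean()
    return float(color_term + depth_term)
```

In a training setup, the two RGBD images would come from the same latent code rendered under two sampled camera parameters, and a differentiable version of this term (bilinear sampling in an autodiff framework) would be added to the GAN objective. The NumPy version above only illustrates the geometry.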

Code Implementations: 2 repos