GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds
This work addresses realistic 3D world synthesis for applications in gaming and virtual environments, a novel task that builds on advances in unsupervised neural rendering.
The paper tackles the problem of generating photorealistic images from 3D block worlds such as Minecraft without paired real images. It does so with an unsupervised neural rendering framework that gives the user control over camera, semantics, and style.
We present GANcraft, an unsupervised neural rendering framework for generating photorealistic images of large 3D block worlds such as those created in Minecraft. Our method takes a semantic block world as input, where each block is assigned a semantic label such as dirt, grass, or water. We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera. In the absence of paired ground truth real images for the block world, we devise a training technique based on pseudo-ground truth and adversarial training. This stands in contrast to prior work on neural rendering for view synthesis, which requires ground truth images to estimate scene geometry and view-dependent appearance. In addition to camera trajectory, GANcraft allows user control over both scene semantics and output style. Experimental results with comparison to strong baselines show the effectiveness of GANcraft on this novel task of photorealistic 3D block world synthesis. The project website is available at https://nvlabs.github.io/GANcraft/ .
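To make the input representation and the volumetric rendering idea concrete, the following is a minimal sketch, not the GANcraft implementation. It assumes standard emission-absorption volume rendering along a camera ray, which the abstract does not spell out, and it replaces every learned component with a fixed lookup table. The label set, the `PALETTE` densities and colors, and the function names `make_world`, `label_at`, and `render_ray` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical semantic labels and a fixed per-label (density, RGB) table.
# In a learned renderer these values would come from a trained network.
AIR, DIRT, GRASS, WATER = 0, 1, 2, 3
PALETTE = {
    AIR: (0.0, (0.0, 0.0, 0.0)),
    DIRT: (8.0, (0.45, 0.30, 0.15)),
    GRASS: (8.0, (0.20, 0.60, 0.20)),
    WATER: (1.0, (0.10, 0.30, 0.80)),
}


def make_world(size=4):
    """Build a tiny block world: world[x][y][z] holds one semantic label."""
    world = [[[AIR] * size for _ in range(size)] for _ in range(size)]
    for x in range(size):
        for y in range(size):
            world[x][y][0] = DIRT   # bottom layer
            world[x][y][1] = GRASS  # surface layer
    return world


def label_at(world, point):
    """Return the label of the block containing a continuous 3D point."""
    size = len(world)
    idx = [math.floor(c) for c in point]
    if any(i < 0 or i >= size for i in idx):
        return AIR  # everything outside the grid is treated as empty
    x, y, z = idx
    return world[x][y][z]


def render_ray(world, origin, direction, near=0.0, far=8.0, n_samples=64):
    """Alpha-composite colors along one ray; returns (rgb, opacity)."""
    delta = (far - near) / n_samples
    transmittance = 1.0  # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta  # midpoint of the i-th segment
        point = [o + t * d for o, d in zip(origin, direction)]
        sigma, rgb = PALETTE[label_at(world, point)]
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        color = [c + weight * k for c, k in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, 1.0 - transmittance


world = make_world()
# Camera above the terrain looking straight down, then straight up.
down_rgb, down_opacity = render_ray(world, (1.5, 1.5, 3.5), (0.0, 0.0, -1.0))
up_rgb, up_opacity = render_ray(world, (1.5, 1.5, 2.5), (0.0, 0.0, 1.0))
```

In this toy setup the downward ray is almost fully absorbed by the grass layer, while the upward ray passes only through empty space and stays transparent. Because the camera origin and direction are free parameters, the same world can be rendered from any viewpoint, which is the property that user-controlled camera trajectories rely on.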