RenderNet: A deep convolutional network for differentiable rendering from 3D shapes
This work addresses a known bottleneck in computer graphics: researchers and practitioners working on inverse rendering tasks need a differentiable renderer, and the paper proposes a novel method to provide one.
The paper tackles the non-differentiability of traditional rendering pipelines, which hinders inverse rendering tasks. It introduces RenderNet, a deep convolutional network that learns to render 2D images from 3D shapes and can be used to estimate shape, pose, lighting, and texture from a single image.
The traditional computer graphics rendering pipeline is designed to procedurally generate high-quality 2D images from 3D shapes with high performance. The non-differentiability caused by discrete operations such as visibility computation makes it hard to explicitly correlate rendering parameters with the resulting image, which poses a significant challenge for inverse rendering tasks. Recent work on differentiable rendering achieves differentiability either by designing surrogate gradients for non-differentiable operations or by using an approximate but differentiable renderer. These methods, however, are still limited in how they handle occlusion and are restricted to particular rendering effects. We present RenderNet, a differentiable rendering convolutional network with a novel projection unit that can render 2D images from 3D shapes. Spatial occlusion and shading calculation are automatically encoded in the network. Our experiments show that RenderNet can successfully learn to implement different shaders, and can be used in inverse rendering tasks to estimate shape, pose, lighting, and texture from a single image.
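To make the non-differentiability of visibility computation concrete, the following minimal sketch contrasts a discrete first-hit depth test with a smooth alternative on a voxel occupancy grid. It is an illustration only and not RenderNet's method: RenderNet learns occlusion with a network, whereas the smooth version here is a hand-written alpha-compositing approximation. The function names `hard_depth` and `soft_render`, the `(depth, height, width)` grid layout, and the 0.5 threshold are all assumptions made for this example.

```python
import numpy as np

def hard_depth(occupancy, threshold=0.5):
    """Discrete visibility: index of the first occupied voxel along depth.

    The threshold test and argmax are piecewise constant in the occupancy
    values, so they give no useful gradient. Background pixels get -1.
    """
    hit = occupancy > threshold
    first = hit.argmax(axis=0)  # first True along the depth axis
    return np.where(hit.any(axis=0), first, -1)

def soft_render(occupancy):
    """Smooth visibility via alpha compositing along the depth axis.

    Returns (silhouette, expected_depth). Every operation is a product or
    a sum, so the outputs vary smoothly with the occupancy values.
    """
    depth = occupancy.shape[0]
    # Transmittance reaching voxel k: product of (1 - occupancy) in front of it.
    free = np.cumprod(1.0 - occupancy, axis=0)
    transmittance = np.concatenate(
        [np.ones_like(occupancy[:1]), free[:-1]], axis=0
    )
    weights = occupancy * transmittance  # contribution of each voxel
    silhouette = weights.sum(axis=0)
    z = np.arange(depth, dtype=occupancy.dtype).reshape(depth, 1, 1)
    expected_depth = (weights * z).sum(axis=0)
    return silhouette, expected_depth

# Hypothetical 4x2x2 grid: one solid voxel, and one ray with two half-filled voxels.
grid = np.zeros((4, 2, 2))
grid[2, 0, 0] = 1.0
grid[1, 1, 1] = 0.5
grid[3, 1, 1] = 0.5

depth_hard = hard_depth(grid)
silhouette, depth_soft = soft_render(grid)
```

In this toy grid the half-filled voxels fall below the threshold and vanish from the hard depth map, while the soft renderer still assigns them partial coverage. A learned renderer has to capture the same occlusion ordering without such a discrete test.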