Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency
This work addresses 3D shape prediction from a single image, an important problem for computer vision applications, though it may appear incremental because it builds on existing multi-view supervision methods.
The paper tackles single-view 3D reconstruction by proposing a differentiable ray consistency term that lets multi-view observations serve as supervision, and it reports improved performance over existing techniques on objects from the PASCAL VOC dataset.
We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations (e.g., foreground masks, depth, color images, and semantics) as supervision for learning single-view 3D prediction. We present an empirical analysis of our technique in a controlled setting. We also show that this approach allows us to improve over existing techniques for single-view reconstruction of objects from the PASCAL VOC dataset.
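To make the idea of a per-ray consistency cost concrete, here is a minimal sketch for a single ray and a foreground-mask observation. The abstract does not give the formulation, so everything here is an assumption for illustration: the shape is taken to be a voxel grid with per-voxel occupancy probabilities, the ray is taken to visit a known ordered list of voxels, and the function names `ray_termination_probs` and `mask_ray_cost` are hypothetical. The cost is a polynomial in the occupancies, which is what makes a term of this kind differentiable with respect to the predicted shape.

```python
def ray_termination_probs(occupancy):
    """Probability that a ray stops at each voxel it visits, or escapes.

    occupancy: occupancy probabilities in [0, 1] for the voxels along the
    ray, ordered from nearest to farthest (an assumed representation).
    Returns a list of length len(occupancy) + 1; the last entry is the
    probability that the ray passes through every voxel and escapes.
    """
    probs = []
    free = 1.0  # probability that all voxels visited so far are empty
    for x in occupancy:
        probs.append(free * x)  # reach this voxel, and it is occupied
        free *= 1.0 - x         # reach this voxel, and it is empty
    probs.append(free)          # the ray escapes the grid
    return probs


def mask_ray_cost(occupancy, is_foreground):
    """Expected cost of one ray against a foreground-mask pixel.

    Assumed event costs: a foreground pixel penalises the ray escaping,
    a background pixel penalises the ray stopping inside the grid.
    """
    probs = ray_termination_probs(occupancy)
    if is_foreground:
        return probs[-1]
    return sum(probs[:-1])


if __name__ == "__main__":
    ray = [0.5, 0.5]
    print(ray_termination_probs(ray))
    print(mask_ray_cost(ray, is_foreground=True))
    print(mask_ray_cost(ray, is_foreground=False))
```

Under these assumptions, a training loss would sum this cost over the rays of every observed view. Other observation types, such as depth, would only change the per-event cost, not the termination probabilities.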