CV, Mar 8, 2017

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis

arXiv:1703.02921v1, 290 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating realistic novel views from limited input, with applications in computer vision and graphics. It represents an incremental improvement over prior methods.

The paper tackles novel 3D view synthesis from a single image by using a transformation-grounded network that predicts flow and visibility maps to handle geometry, then hallucinates missing parts, resulting in reduced artifacts and better qualitative and quantitative results compared to existing methods.

We present a transformation-grounded image generation network for novel 3D view synthesis from a single image. Instead of taking a 'blank slate' approach, we first explicitly infer the parts of the geometry visible both in the input and novel views and then re-cast the remaining synthesis problem as image completion. Specifically, we predict both a flow that moves the pixels from the input to the novel view and a novel visibility map that helps deal with occlusion/disocclusion. Next, conditioned on those intermediate results, we hallucinate (infer) parts of the object invisible in the input image. In addition to the new network structure, training with a combination of adversarial and perceptual loss results in a reduction in common artifacts of novel view synthesis such as distortions and holes, while successfully generating high frequency details and preserving visual aspects of the input image. We evaluate our approach on a wide range of synthetic and real examples. Both qualitative and quantitative results show our method achieves significantly better results compared to existing methods.
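
The flow-then-complete idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' network. It assumes a backward flow convention (each target pixel stores a `(dy, dx)` offset to its source pixel in the input), nearest-neighbor sampling, and a fixed visibility threshold. The function name `warp_with_visibility` and its parameters are hypothetical. The returned hole mask marks the pixels that a completion stage would then have to hallucinate.

```python
import numpy as np

def warp_with_visibility(image, flow, visibility, threshold=0.5):
    """Move input pixels to a novel view and mark what is left to complete.

    image:      (H, W, C) input view.
    flow:       (H, W, 2) per-target-pixel (dy, dx) offsets into the input
                (backward flow; an assumption of this sketch).
    visibility: (H, W) values in [0, 1]; high means the target pixel is
                also visible in the input view.
    Returns (warped, holes): warped is (H, W, C) with unknown pixels set
    to 0, holes is a boolean (H, W) mask of pixels left for completion.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates for every target pixel (nearest-neighbor sampling).
    src_y = np.rint(ys + flow[..., 0]).astype(int)
    src_x = np.rint(xs + flow[..., 1]).astype(int)

    # Pixels whose source falls outside the input cannot be copied.
    in_bounds = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    src_y = np.clip(src_y, 0, h - 1)
    src_x = np.clip(src_x, 0, w - 1)

    warped = image[src_y, src_x]

    # Keep a pixel only if its source is in bounds and it is marked visible.
    known = in_bounds & (visibility > threshold)
    warped = np.where(known[..., None], warped, 0)
    return warped, ~known
```

In this sketch, everything flagged in `holes` is exactly the region the abstract describes as recast into an image completion problem. A learned model would predict `flow` and `visibility` rather than receive them as inputs.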

Code Implementations: 2 repos
