CV · Oct 18, 2021

Learning multiplane images from single views with self-supervision

arXiv:2110.09380v2 · 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses view synthesis from a single image, a problem relevant to computer vision and graphics applications. Because it can be trained on internet data, it offers a more generalizable solution, though the advance is incremental: it builds on existing multiplane image methods.

The paper tackles novel view generation from a single image with dynamic content. It proposes CycleMPI, a self-supervised framework that learns multiplane image representations without stereo data. In zero-shot evaluation on datasets such as RealEstate10K and Mannequin Challenge, it achieves results comparable to state-of-the-art methods.

Generating static novel views from an already captured image is a hard task in computer vision and graphics, in particular when the single input image has dynamic parts such as people or moving objects. In this paper, we tackle this problem by proposing a new framework, called CycleMPI, that is capable of learning a multiplane image representation from single images through a cyclic training strategy for self-supervision. Our framework does not require stereo data for training, so it can be trained with massive visual data from the Internet, resulting in a better generalization capability even for very challenging cases. Although our method does not require stereo data for supervision, it reaches results on stereo datasets comparable to the state of the art in a zero-shot scenario. We evaluated our method on the RealEstate10K and Mannequin Challenge datasets for view synthesis and presented qualitative results on the Places II dataset.
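For readers unfamiliar with the representation named in the abstract: a multiplane image is a stack of fronto-parallel RGBA planes at fixed depths, and an image is commonly rendered from it by alpha-compositing the planes from back to front. The sketch below shows only that generic compositing step for a single pixel. It is not taken from the CycleMPI paper; the function name `composite_mpi_pixel` and its arguments are hypothetical, and the per-plane warping into the target view and the cyclic training strategy are omitted.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def composite_mpi_pixel(colors: Sequence[RGB], alphas: Sequence[float]) -> RGB:
    """Back-to-front "over" compositing of MPI planes for one pixel.

    colors[i] and alphas[i] belong to plane i, ordered from the farthest
    plane to the nearest one. This ordering is an assumption of the sketch.
    """
    if len(colors) != len(alphas):
        raise ValueError("colors and alphas must have the same length")

    out: RGB = (0.0, 0.0, 0.0)  # start from a black background
    for color, alpha in zip(colors, alphas):
        # Each nearer plane covers what is behind it in proportion to alpha.
        out = tuple(alpha * c + (1.0 - alpha) * o for c, o in zip(color, out))
    return out
```

As a usage example, `composite_mpi_pixel([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [1.0, 0.5])` blends an opaque red far plane with a half-transparent blue near plane in equal parts.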

