PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis
This work addresses the challenge of generating large sets of plausible novel views in computer vision, offering incremental improvements in multi-view consistency and image quality.
The paper tackles generative novel view synthesis by proposing a set-based model that simultaneously generates multiple self-consistent new views from any number of input views. The model outperforms state-of-the-art image-based baselines on standard datasets and excels at tasks without a natural sequential ordering, such as loops and binocular trajectories.
This paper considers the problem of generative novel view synthesis (GNVS), generating novel, plausible views of a scene given a limited number of known views. Here, we propose a set-based generative model that can simultaneously generate multiple, self-consistent new views, conditioned on any number of views. Our approach is not limited to generating a single image at a time and can condition on a variable number of views. As a result, when generating a large number of views, our method is not restricted to a low-order autoregressive generation approach and is better able to maintain generated image quality over large sets of images. We evaluate our model on standard NVS datasets and show that it outperforms the state-of-the-art image-based GNVS baselines. Further, we show that the model is capable of generating sets of views that have no natural sequential ordering, like loops and binocular trajectories, and significantly outperforms other methods on such tasks.
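The contrast the abstract draws between low-order autoregressive generation and set-based generation can be made concrete with a small bookkeeping sketch. The Python code below is not the paper's sampling procedure. It only lists which views would be conditioned on and which would be generated at each step, under assumptions that are ours: view 0 is the single known input, views are indexed along a trajectory, and each new set conditions on the most recent available views. The names `markov_schedule`, `set_schedule`, `order`, `set_size`, and `num_cond` are hypothetical.

```python
from typing import List, Tuple

# One generation step: (indices of conditioning views, indices of generated views).
Step = Tuple[List[int], List[int]]


def markov_schedule(num_views: int, order: int = 1) -> List[Step]:
    """Low-order autoregressive schedule: one view per step,
    conditioned on at most `order` immediately preceding views."""
    steps: List[Step] = []
    for t in range(1, num_views):
        cond = list(range(max(0, t - order), t))
        steps.append((cond, [t]))
    return steps


def set_schedule(num_views: int, set_size: int, num_cond: int) -> List[Step]:
    """Set-based schedule (illustrative assumption): generate up to
    `set_size` views jointly per step, conditioned on up to `num_cond`
    of the most recently available views."""
    steps: List[Step] = []
    available = [0]  # view 0 is assumed to be the known input view
    t = 1
    while t < num_views:
        gen = list(range(t, min(t + set_size, num_views)))
        cond = available[-num_cond:]
        steps.append((cond, gen))
        available.extend(gen)
        t += len(gen)
    return steps


if __name__ == "__main__":
    for cond, gen in markov_schedule(7):
        print("markov: condition on", cond, "-> generate", gen)
    for cond, gen in set_schedule(7, set_size=3, num_cond=2):
        print("set:    condition on", cond, "-> generate", gen)
```

In this toy setting, seven views take six sequential steps under the first-order schedule and two steps under the set-based one, and the views inside a set are produced jointly instead of each depending only on its predecessor. How the actual model chooses and chains its sets is not specified in the abstract.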