CV · Jul 23, 2023

LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference

arXiv:2307.12217v3 · 6 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating realistic novel views from a single image, and represents an incremental improvement over existing methods.

The paper tackles single-view view synthesis by regressing locally-learned planes from a single RGB image to generate novel views, achieving state-of-the-art results with an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5% compared to MINE.

We propose a novel method, LoLep, which regresses Locally-Learned planes from a single RGB image to represent scenes accurately, thus generating better novel views. Without depth information, regressing appropriate plane locations is a challenging problem. To solve this issue, we pre-partition the disparity space into bins and design a disparity sampler to regress local offsets for multiple planes in each bin. However, with such a sampler alone the network does not converge; we therefore propose two optimization strategies that account for the different disparity distributions of the datasets, as well as an occlusion-aware reprojection loss as a simple yet effective geometric supervision technique. We also introduce a self-attention mechanism to improve occlusion inference and present a Block-Sampling Self-Attention (BS-SA) module to address the problem of applying self-attention to large feature maps. We demonstrate the effectiveness of our approach and achieve state-of-the-art results on different datasets. Compared to MINE, our approach has an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5%. We also evaluate the performance on real-world images and demonstrate its benefits.
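
The bin-and-offset idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes equal-width bins over a normalized disparity range and offsets already squashed into [0, 1] (for example by a sigmoid on the network output). The function names `bin_edges` and `place_planes` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bin_edges(d_min, d_max, num_bins):
    """Split the disparity range [d_min, d_max] into equal-width bins.

    Equal-width bins are an assumption of this sketch, not a detail
    taken from the paper.
    """
    return np.linspace(d_min, d_max, num_bins + 1)


def place_planes(edges, offsets):
    """Turn per-bin local offsets into plane disparities.

    offsets has shape (num_bins, planes_per_bin) with values in [0, 1];
    0 places a plane at the lower edge of its bin, 1 at the upper edge.
    In a learned model these offsets would be regressed by the network;
    here they are plain inputs.
    """
    offsets = np.asarray(offsets, dtype=float)
    if offsets.ndim != 2 or offsets.shape[0] != len(edges) - 1:
        raise ValueError("offsets must have shape (num_bins, planes_per_bin)")
    if np.any(offsets < 0.0) or np.any(offsets > 1.0):
        raise ValueError("offsets must lie in [0, 1]")
    lower = edges[:-1, None]          # lower edge of each bin
    width = np.diff(edges)[:, None]   # width of each bin
    return lower + offsets * width    # each plane stays inside its own bin


if __name__ == "__main__":
    edges = bin_edges(0.0, 1.0, 4)
    # One plane per bin, placed at the centre of each bin.
    print(place_planes(edges, [[0.5]] * 4).ravel())
```

Because each offset is confined to its own bin, planes in different bins cannot cross or collapse onto one another, which is one plausible reading of why the disparity space is pre-partitioned before the local offsets are regressed.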

