Light Field Neural Rendering
This work addresses the problem of generating realistic novel views from limited input, with applications in VR/AR and computer vision. It is an incremental advance that integrates existing approaches.
The paper tackles novel view synthesis from sparse input views by combining light field rendering with geometric reconstruction, so that view-dependent effects such as reflections are modeled accurately. It reports state-of-the-art performance on multiple datasets, with larger improvements on scenes with severe view-dependent variations.
Classical light field rendering for novel view synthesis can accurately reproduce view-dependent effects such as reflection, refraction, and translucency, but requires a dense view sampling of the scene. Methods based on geometric reconstruction need only sparse views, but cannot accurately model non-Lambertian effects. We introduce a model that combines the strengths and mitigates the limitations of these two directions. By operating on a four-dimensional representation of the light field, our model learns to represent view-dependent effects accurately. By enforcing geometric constraints during training and inference, the scene geometry is implicitly learned from a sparse set of views. Concretely, we introduce a two-stage transformer-based model that first aggregates features along epipolar lines, then aggregates features along reference views to produce the color of a target ray. Our model outperforms the state-of-the-art on multiple forward-facing and 360° datasets, with larger margins on scenes with severe view-dependent variations.
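The two-stage aggregation described above can be illustrated with a minimal NumPy sketch. This is not the paper's model: the transformers are replaced by single-head dot-product attention pooling with no learned projections, and the names `attention_pool`, `render_ray`, `ray_query`, `epipolar_feats`, and `w_rgb` are hypothetical. It assumes features have already been sampled at P points along the epipolar line of the target ray in each of V reference views.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_pool(query, tokens):
    """Single-head dot-product attention pooling (stand-in for a transformer).

    query:  (..., D)    one query vector per group of tokens
    tokens: (..., N, D) N tokens per group
    returns (..., D)    attention-weighted average of the tokens
    """
    d = tokens.shape[-1]
    scores = np.einsum("...d,...nd->...n", query, tokens) / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return np.einsum("...n,...nd->...d", weights, tokens)


def render_ray(ray_query, epipolar_feats, w_rgb):
    """Two-stage aggregation for one target ray (illustrative only).

    ray_query:      (D,)      hypothetical embedding of the target ray
    epipolar_feats: (V, P, D) features at P epipolar samples in V views
    w_rgb:          (D, 3)    hypothetical linear color head
    returns         (3,)      color in (0, 1)
    """
    num_views, _, dim = epipolar_feats.shape
    # Stage 1: aggregate along each epipolar line -> one feature per view.
    per_view_query = np.broadcast_to(ray_query, (num_views, dim))
    per_view = attention_pool(per_view_query, epipolar_feats)  # (V, D)
    # Stage 2: aggregate across reference views -> one feature per ray.
    ray_feat = attention_pool(ray_query, per_view)  # (D,)
    # Map the ray feature to a color with a sigmoid-activated linear head.
    return 1.0 / (1.0 + np.exp(-(ray_feat @ w_rgb)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 8, 16))  # V=4 views, P=8 samples, D=16
    query = rng.normal(size=16)
    head = rng.normal(size=(16, 3))
    print(render_ray(query, feats, head))
```

The ordering mirrors the abstract: pooling along the epipolar line first lets each view decide which sample along the line is most relevant, and pooling across views second combines those per-view summaries into the color of the target ray.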