CV · Oct 12, 2023

Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes

arXiv:2310.08585v1 · 69 citations · h-index: 37
Originality: Incremental advance
AI Analysis

This paper addresses high-fidelity, real-time novel view synthesis for dynamic scenes, a problem that matters for applications such as VR/AR and film production. Its contribution is a hybrid approach rather than a paradigm shift.

The paper tackles dynamic view synthesis from multi-view videos by introducing Im4D, a hybrid representation combining grid-based geometry and image-based appearance, achieving state-of-the-art rendering quality and real-time performance at 79.8 FPS for 512x512 images.

This paper aims to tackle the challenge of dynamic view synthesis from multi-view videos. The key observation is that while previous grid-based methods offer consistent rendering, they fall short in capturing appearance details of a complex dynamic scene, a domain where multi-view image-based rendering methods demonstrate the opposite properties. To combine the best of two worlds, we introduce Im4D, a hybrid scene representation that consists of a grid-based geometry representation and a multi-view image-based appearance representation. Specifically, the dynamic geometry is encoded as a 4D density function composed of spatiotemporal feature planes and a small MLP network, which globally models the scene structure and facilitates the rendering consistency. We represent the scene appearance by the original multi-view videos and a network that learns to predict the color of a 3D point from image features, instead of memorizing detailed appearance totally with networks, thereby naturally making the learning of networks easier. Our method is evaluated on five dynamic view synthesis datasets including DyNeRF, ZJU-MoCap, NHR, DNA-Rendering and ENeRF-Outdoor datasets. The results show that Im4D exhibits state-of-the-art performance in rendering quality and can be trained efficiently, while realizing real-time rendering with a speed of 79.8 FPS for 512x512 images, on a single RTX 3090 GPU.
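The geometry branch described in the abstract (spatiotemporal feature planes followed by a small MLP that outputs density) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The six axis-aligned planes over (x, y, z, t), the concatenation of plane features, the softplus activation, the layer sizes, and the names `bilinear_sample` and `PlaneDensityField` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def bilinear_sample(plane, u, v):
    """Bilinearly interpolate a (H, W, C) feature plane at u, v in [0, 1]."""
    h, w, _ = plane.shape
    x = np.clip(u, 0.0, 1.0) * (w - 1)
    y = np.clip(v, 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = plane[y0, x0] * (1 - wx) + plane[y0, x1] * wx
    bottom = plane[y1, x0] * (1 - wx) + plane[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


class PlaneDensityField:
    """Toy 4D density function: feature planes plus a two-layer MLP."""

    # Assumed plane layout: every pair of axes among (x, y, z, t).
    PAIRS = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]

    def __init__(self, resolution=32, channels=8, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Random stand-ins for learned plane features and MLP weights.
        self.planes = [
            rng.normal(0.0, 0.1, (resolution, resolution, channels))
            for _ in self.PAIRS
        ]
        in_dim = channels * len(self.PAIRS)
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def density(self, xyzt):
        """Map (N, 4) points with coordinates in [0, 1] to (N,) densities."""
        feats = [
            bilinear_sample(plane, xyzt[:, a], xyzt[:, b])
            for plane, (a, b) in zip(self.planes, self.PAIRS)
        ]
        hidden = np.maximum(np.concatenate(feats, axis=1) @ self.w1 + self.b1, 0.0)
        raw = hidden @ self.w2 + self.b2
        # Softplus keeps the density non-negative (an assumed choice).
        return np.logaddexp(0.0, raw[:, 0])


if __name__ == "__main__":
    field = PlaneDensityField()
    points = np.random.default_rng(1).uniform(size=(5, 4))  # (x, y, z, t)
    print(field.density(points).shape)
```

Calling `density` on a batch of normalized (x, y, z, t) samples returns one non-negative value per point. In the method described above, color would come from a separate network that reads image features from the multi-view videos; that branch is not sketched here.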

Code Implementations: 1 repo