CV · GR · Jun 23, 2022

UNeRF: Time and Memory Conscious U-Shaped Network for Training Neural Radiance Fields

arXiv:2206.11952v1 · 2 citations · h-index: 57
Originality: Incremental advance
AI Analysis

This work addresses the resource cost of NeRF training, a bottleneck for applications such as novel view synthesis and dynamic scene reconstruction. It is an incremental advance because it builds on existing NeRF methods.

The paper tackles the high training time and memory consumption of Neural Radiance Fields (NeRFs) by proposing UNeRF, a U-shaped network that shares computations across neighboring sample points, achieving reduced memory footprint, improved accuracy, and faster processing in both training and inference for static and dynamic scenes.

Neural Radiance Fields (NeRFs) increase reconstruction detail for novel view synthesis and scene reconstruction, with applications ranging from large static scenes to dynamic human motion. However, the increased resolution and model-free nature of such neural fields come at the cost of high training times and excessive memory requirements. Recent advances improve the inference time by using complementary data structures, yet these methods are ill-suited for dynamic scenes and often increase memory consumption. Little has been done to reduce the resources required at training time. We propose a method to exploit the redundancy of NeRF's sample-based computations by partially sharing evaluations across neighboring sample points. Our UNeRF architecture is inspired by the UNet, where spatial resolution is reduced in the middle of the network and information is shared between adjacent samples. Although this change violates the strict and conscious separation of view-dependent appearance and view-independent density estimation in the NeRF method, we show that it improves novel view synthesis. We also introduce an alternative subsampling strategy which shares computation while minimizing any violation of view invariance. UNeRF is a plug-in module for the original NeRF network. Our major contributions include reduction of the memory footprint, improved accuracy, and reduced amortized processing time both during training and inference. With only weak assumptions on locality, we achieve improved resource utilization on a variety of neural radiance fields tasks. We demonstrate applications to the novel view synthesis of static scenes as well as dynamic human shape and motion.
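The U-shaped sharing described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the layer widths, the use of pairwise averaging to reduce the sample resolution, the use of repetition to restore it, and all function names (`pool_neighbors`, `unpool_neighbors`, `u_shaped_forward`) are assumptions made for illustration. It only shows the general idea of running the middle of a per-sample MLP at half the sample resolution along each ray and merging the result back through a skip connection.

```python
import numpy as np


def dense(x, w, b):
    """Fully connected layer with ReLU, applied independently to every sample."""
    return np.maximum(x @ w + b, 0.0)


def pool_neighbors(x):
    """Halve the sample resolution by averaging each pair of adjacent samples."""
    rays, samples, feat = x.shape
    return x.reshape(rays, samples // 2, 2, feat).mean(axis=2)


def unpool_neighbors(x):
    """Restore the sample resolution by repeating each coarse feature twice."""
    return np.repeat(x, 2, axis=1)


def init_params(rng, in_dim=3, width=16):
    """Random weights for the toy network (hypothetical sizes)."""
    shapes = {
        "enc": (in_dim, width),     # full-resolution encoder layer
        "mid": (width, width),      # half-resolution middle layer
        "dec": (2 * width, width),  # decoder layer after the skip connection
        "out": (width, 4),          # RGB + density head
    }
    return {k: (rng.normal(0.0, 0.1, s), np.zeros(s[1])) for k, s in shapes.items()}


def u_shaped_forward(params, points):
    """points: (rays, samples, in_dim) with an even number of samples per ray.

    Returns raw (rays, samples, 4) outputs, i.e. RGB and density before any
    output activation.
    """
    fine = dense(points, *params["enc"])                  # one feature per sample
    coarse = dense(pool_neighbors(fine), *params["mid"])  # shared by each sample pair
    merged = np.concatenate([fine, unpool_neighbors(coarse)], axis=-1)  # skip connection
    hidden = dense(merged, *params["dec"])
    w, b = params["out"]
    return hidden @ w + b


rng = np.random.default_rng(0)
params = init_params(rng)
points = rng.uniform(-1.0, 1.0, size=(2, 8, 3))  # 2 rays, 8 samples per ray
outputs = u_shaped_forward(params, points)
```

In this sketch the middle layer is evaluated on half as many samples as the encoder and decoder layers, which is where the saving in computation and activation memory would come from. Because each coarse feature is shared by a pair of neighboring samples, the output at one sample depends on its pair neighbor but not on samples outside the pair.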

