Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences
This work addresses a practical issue for UAV-based 3D reconstruction by enabling view synthesis from unsynchronized RGB-D data, though the contribution is incremental because it builds on existing depth-regularized radiance field methods.
The paper tackles the problem of learning radiance fields from asynchronous RGB-D sequences, a common situation in UAV city modeling. It proposes a novel time-pose function and a joint optimization scheme. The resulting method outperforms baselines without regularization and shows improved results on real-world drone data.
It has been shown that learning radiance fields with depth rendering and depth supervision can effectively improve the quality and convergence of view synthesis. However, this paradigm requires the input RGB-D sequences to be synchronized, which hinders its use in the UAV city modeling scenario. Because high-speed flight introduces asynchrony between RGB images and depth images, we propose a novel time-pose function, an implicit network that maps timestamps to $\mathrm{SE}(3)$ elements. To simplify the training process, we also design a joint optimization scheme that learns the large-scale depth-regularized radiance fields and the time-pose function together. Our algorithm consists of three steps: (1) time-pose function fitting, (2) radiance field bootstrapping, and (3) joint pose error compensation and radiance field refinement. In addition, we introduce a large synthetic dataset with diverse controlled mismatches and ground truth to evaluate this new problem setting systematically. Through extensive experiments, we demonstrate that our method outperforms baselines without regularization. We also show qualitatively improved results on a real-world asynchronous RGB-D sequence captured by a drone. Code, data, and models will be made publicly available.
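To make the idea of a time-pose function concrete, the following is a minimal sketch of a network that maps a scalar timestamp to an $\mathrm{SE}(3)$ element. It is not the authors' implementation: the abstract does not specify the architecture, so the layer sizes, the tanh activation, the random initialization, the axis-angle rotation parameterization, and all function names (`init_mlp`, `mlp_forward`, `so3_exp`, `time_pose`) are assumptions made for illustration. The sketch also assumes timestamps are normalized to a small range such as [0, 1], and it shows only the forward mapping, not the fitting step.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create random weights for a small MLP (hypothetical sizes); biases start at zero."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        bound = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def mlp_forward(layers, x):
    """Apply the MLP with tanh on hidden layers and a linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def so3_exp(w):
    """Rodrigues' formula: axis-angle vector w -> 3x3 rotation matrix."""
    wx, wy, wz = w
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    K = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]  # skew-symmetric matrix of w
    if theta < 1e-8:
        a, b = 1.0, 0.5  # Taylor limits of the coefficients near zero rotation
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def time_pose(layers, t):
    """Map a normalized timestamp t to a 4x4 homogeneous pose matrix."""
    out = mlp_forward(layers, [t])   # 6 numbers: axis-angle (3) + translation (3)
    R = so3_exp(out[:3])
    trans = out[3:6]
    return [R[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Illustrative usage: query the (untrained) function at an arbitrary timestamp.
layers = init_mlp([1, 32, 32, 6])
pose = time_pose(layers, 0.5)
```

Because the function is continuous in time, it can be queried at the timestamp of any RGB or depth frame, which is what lets one sensor's poses be inferred from the other's trajectory. Building the rotation with Rodrigues' formula guarantees that every output is a valid rigid transform, whatever the network weights are.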