Sync-NeRF: Generalizing Dynamic NeRFs to Unsynchronized Videos
This work addresses the challenge of dynamic scene reconstruction for applications such as VR/AR and robotics, where synchronizing the input videos is often impractical. It represents an incremental improvement over existing NeRF methods.
The paper tackles the reconstruction of dynamic scenes from unsynchronized multi-view videos using neural radiance fields (NeRF), a setting in which existing dynamic NeRF methods fail. By introducing a time offset for each video and jointly optimizing these offsets with the NeRF, the method improves various baselines by large margins, as validated on the Plenoptic Video Dataset and a new Unsynchronized Dynamic Blender Dataset.
Recent advancements in 4D scene reconstruction using neural radiance fields (NeRF) have demonstrated the ability to represent dynamic scenes from multi-view videos. However, in unsynchronized settings these methods fail to reconstruct the dynamic scenes and struggle to fit even the training views. This happens because they employ a single latent embedding per frame, while the multi-view images at the same frame index were actually captured at different moments. To address this limitation, we introduce time offsets for individual unsynchronized videos and jointly optimize the offsets with NeRF. By design, our method is applicable to various baselines and improves them by large margins. Furthermore, finding the offsets naturally synchronizes the videos without manual effort. Experiments are conducted on the common Plenoptic Video Dataset and a newly built Unsynchronized Dynamic Blender Dataset to verify the performance of our method. Project page: https://seoha-kim.github.io/sync-nerf
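The idea of jointly optimizing per-video time offsets with the scene model can be illustrated with a minimal toy sketch. It is not the paper's implementation: the NeRF is replaced by a hypothetical two-parameter sinusoid `scene(a, b, t)`, each "camera" records a single scalar per frame, and camera 0 is assumed to be the reference with its offset fixed at zero to remove the global time-shift ambiguity. All names (`scene`, `fit`, `OMEGA`) and the hyperparameters are assumptions made for illustration.

```python
import math

OMEGA = 1.0  # assumed known angular frequency of the toy scene


def scene(a, b, t):
    """Toy stand-in for a dynamic NeRF: a scalar that varies with time t."""
    return a * math.sin(OMEGA * t) + b * math.cos(OMEGA * t)


def fit(observations, frame_times, steps=2000, lr=0.5):
    """Jointly fit scene parameters (a, b) and one time offset per camera.

    observations[k][i] is the value camera k recorded at frame index i.
    Camera 0 is the reference, so its offset is never updated.
    """
    num_cams = len(observations)
    n = num_cams * len(frame_times)
    a, b = 0.5, 0.0
    offsets = [0.0] * num_cams
    loss = float("inf")
    for _ in range(steps):
        grad_a = grad_b = 0.0
        grad_off = [0.0] * num_cams
        loss = 0.0
        for k in range(num_cams):
            for i, t in enumerate(frame_times):
                # Query the scene at the frame time shifted by this camera's offset.
                tk = t + offsets[k]
                s, c = math.sin(OMEGA * tk), math.cos(OMEGA * tk)
                r = a * s + b * c - observations[k][i]  # residual
                loss += r * r / n
                grad_a += 2 * r * s / n
                grad_b += 2 * r * c / n
                grad_off[k] += 2 * r * OMEGA * (a * c - b * s) / n
        # One gradient-descent step on the scene and the non-reference offsets.
        a -= lr * grad_a
        b -= lr * grad_b
        for k in range(1, num_cams):
            offsets[k] -= lr * grad_off[k]
    return a, b, offsets, loss


# Synthetic unsynchronized capture: three cameras share frame indices,
# but cameras 1 and 2 actually sampled the scene 0.3 later and 0.2 earlier.
frame_times = [0.1 * i for i in range(60)]
true_offsets = [0.0, 0.3, -0.2]
observations = [
    [scene(1.0, 0.0, t + d) for t in frame_times] for d in true_offsets
]

a, b, offsets, loss = fit(observations, frame_times)
print("scene params:", a, b)
print("recovered offsets:", offsets)
```

In this noise-free toy the recovered offsets should land close to the true ones, which mirrors the abstract's observation that finding the offsets also synchronizes the videos. A real dynamic NeRF would replace `scene` with the network's time-conditioned rendering and obtain the offset gradients through automatic differentiation rather than by hand.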