CV · AI · Jul 31, 2023

DiVa-360: The Dynamic Visual Dataset for Immersive Neural Fields

Stanford
arXiv:2307.16897v2 · 14 citations · h-index: 22
Originality: Synthesis-oriented
AI Analysis

DiVa-360 addresses a dataset limitation for researchers working on neural fields and 3D scene capture. The contribution is incremental, however, because it provides data rather than an algorithmic breakthrough.

The authors tackle the lack of large-scale, multi-view, real-world datasets for dynamic neural fields by introducing DiVa-360, a dataset of 17.4 million image frames captured with 53 cameras. It covers object-centric, hand-object interaction, and long-duration sequences, and the authors benchmark state-of-the-art dynamic neural field methods on it to identify their limitations and open challenges.

Advances in neural fields are enabling high-fidelity capture of the shape and appearance of dynamic 3D scenes. However, their capabilities lag behind those offered by conventional representations such as 2D videos because of algorithmic challenges and the lack of large-scale multi-view real-world datasets. We address the dataset limitation with DiVa-360, a real-world 360 dynamic visual dataset that contains synchronized high-resolution and long-duration multi-view video sequences of table-scale scenes captured using a customized low-cost system with 53 cameras. It contains 21 object-centric sequences categorized by different motion types, 25 intricate hand-object interaction sequences, and 8 long-duration sequences for a total of 17.4 M image frames. In addition, we provide foreground-background segmentation masks, synchronized audio, and text descriptions. We benchmark the state-of-the-art dynamic neural field methods on DiVa-360 and provide insights about existing methods and future challenges on long-duration neural field capture.
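
To give a sense of scale, the figures quoted in the abstract can be turned into rough per-camera averages. The sketch below is back-of-the-envelope arithmetic only. It assumes that the 17.4 M total is summed over all 53 cameras and that every camera records every sequence; the abstract states neither, and the names used here are illustrative.

```python
# Figures quoted in the abstract (the frame total is rounded in the source).
TOTAL_FRAMES = 17_400_000
NUM_CAMERAS = 53
SEQUENCES = {"object_centric": 21, "hand_object": 25, "long_duration": 8}


def frames_per_camera(total_frames: int, num_cameras: int) -> float:
    """Average frames per camera.

    ASSUMPTION: the total counts frames from all cameras together.
    """
    return total_frames / num_cameras


num_sequences = sum(SEQUENCES.values())
per_camera = frames_per_camera(TOTAL_FRAMES, NUM_CAMERAS)
# ASSUMPTION: every camera records every sequence, so this is a plain mean.
# Long-duration sequences will in practice be far above this average.
per_camera_per_sequence = per_camera / num_sequences

print(f"sequences: {num_sequences}")
print(f"frames per camera: {per_camera:,.0f}")
print(f"mean frames per camera per sequence: {per_camera_per_sequence:,.0f}")
```

Under these assumptions the dataset has 54 sequences, about 328,000 frames per camera, and roughly 6,080 frames per camera per sequence on average.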

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
