CV · Sep 3, 2024

A New People-Object Interaction Dataset and NVS Benchmarks

arXiv:2409.12980v1 · 1 citation · h-index: 8
Originality: Synthesis-oriented
AI Analysis

This paper provides a more comprehensive dataset for researchers studying human-object interactions, though the contribution is incremental: it improves data quality and scope.

The authors tackled the problem of limited data for human-object interaction research by introducing a new dataset with 38 series of 30-view multi-person or single-person RGB-D video sequences, captured at 4K resolution and 25 FPS. They also established benchmarks by evaluating state-of-the-art novel view synthesis models on this dataset.

Novel view synthesis (NVS) in human-object interaction scenes has recently received increasing attention. Existing human-object interaction datasets mainly consist of static data with limited views, offer only RGB images or videos, and mostly contain interactions between a single person and objects. Moreover, these datasets suffer from complex lighting environments, poor synchronization, and low resolution, which hinders high-quality human-object interaction studies. In this paper, we introduce a new people-object interaction dataset that comprises 38 series of 30-view multi-person or single-person RGB-D video sequences, accompanied by camera parameters, foreground masks, SMPL models, some point clouds, and mesh files. The video sequences are captured by 30 Azure Kinect cameras uniformly surrounding the scene, each at 4K resolution and 25 FPS, and last 1 to 19 seconds. We also evaluate several state-of-the-art (SOTA) NVS models on our dataset to establish NVS benchmarks. We hope our work can inspire further research in human-object interaction.
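
To give a sense of scale, the capture figures quoted in the abstract (38 series, 30 views, 25 FPS, 1 to 19 seconds) can be turned into rough frame counts. The sketch below is only back-of-the-envelope arithmetic: the class and field names are hypothetical, it counts one frame per view per timestep (color and depth are not counted separately), and it assumes every view in a series runs for the same duration, which the abstract does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureConfig:
    """Capture figures from the abstract; names here are illustrative only."""
    num_series: int = 38
    num_views: int = 30
    fps: int = 25
    min_seconds: int = 1
    max_seconds: int = 19

    def frames_per_view(self, seconds: float) -> int:
        # Frames recorded by a single camera over the given duration.
        return round(self.fps * seconds)

    def frames_per_series(self, seconds: float) -> int:
        # Frames across all synchronized views for one series.
        return self.num_views * self.frames_per_view(seconds)


cfg = CaptureConfig()
shortest = cfg.frames_per_series(cfg.min_seconds)  # 30 views * 25 FPS * 1 s
longest = cfg.frames_per_series(cfg.max_seconds)   # 30 views * 25 FPS * 19 s
print(shortest, longest)
```

Under these assumptions a single series holds between 750 and 14,250 multi-view frames, so the per-series size varies by a factor of 19 depending on sequence length.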

