CV · AI · GR · HC · RO · Jun 1, 2023

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects

MIT, Stanford
arXiv:2306.00956v1 · 64 citations · h-index: 142
Originality: Synthesis-oriented
AI Analysis

This paper contributes a new benchmark and dataset for research in multisensory object-centric learning in computer vision and robotics, though the contribution is incremental, since it builds on existing multisensory concepts.

The authors introduced the ObjectFolder Benchmark, a suite of 10 tasks for multisensory object-centric learning using sight, sound, and touch, and the ObjectFolder Real dataset with measurements for 100 real-world objects, showing the importance of multisensory perception in tasks like recognition and manipulation.

We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch. We also introduce the ObjectFolder Real dataset, including the multisensory measurements for 100 real-world household objects, building upon a newly designed pipeline for collecting the 3D meshes, videos, impact sounds, and tactile readings of real-world objects. We conduct systematic benchmarking on both the 1,000 multisensory neural objects from ObjectFolder, and the real multisensory data from ObjectFolder Real. Our results demonstrate the importance of multisensory perception and reveal the respective roles of vision, audio, and touch for different object-centric learning tasks. By publicly releasing our dataset and benchmark suite, we hope to catalyze and enable new research in multisensory object-centric learning in computer vision, robotics, and beyond. Project page: https://objectfolder.stanford.edu

