CV · Sep 18, 2017

Matterport3D: Learning from RGB-D Data in Indoor Environments

arXiv:1709.06158v1 · 2479 citations
Originality: Synthesis-oriented
AI Analysis

This dataset addresses a critical need for computer vision researchers working on indoor scene analysis, though the contribution is incremental in that it builds on existing RGB-D data collection efforts.

The authors tackle the lack of large-scale RGB-D datasets for indoor scene understanding by introducing Matterport3D, a dataset of 10,800 panoramic views built from 194,400 RGB-D images across 90 building-scale scenes, with annotations that support tasks such as semantic segmentation and keypoint matching.

Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and comprehensive, diverse panoramic set of views over entire buildings enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.

Code Implementations: 1 repo
