CV · RO · Apr 5, 2019

HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

arXiv:1904.03167v2 · 196 citations
Originality Synthesis-oriented
AI Analysis

This dataset addresses the problem of limited and scattered training data for researchers in computer vision, enabling more robust evaluation of 6D pose estimation methods, though it is incremental in expanding existing dataset options.

The authors address the lack of comprehensive datasets for 6D object pose estimation with HomebrewedDB, a dataset of 33 objects across 13 scenes. It targets training from 3D models, scalability, occlusions, and varying light conditions, and sets a baseline using the DPOD detector.

Among the most important prerequisites for creating and evaluating 6D object pose detectors are datasets with labeled 6D poses. With the advent of deep learning, demand for such datasets is growing continuously. Despite the fact that some of them exist, they are scarce and typically have restricted setups, such as a single object per sequence, or they focus on specific object types, such as textureless industrial parts. Besides, two significant components are often ignored: training using only available 3D models instead of real data, and scalability, i.e. training one method to detect all objects rather than training one detector per object. Other challenges, such as occlusions, changing light conditions and changes in object appearance, as well as precisely defined benchmarks, are either not present or are scattered among different datasets. In this paper we present a dataset for 6D pose estimation that covers the above-mentioned challenges, mainly targeting training from 3D models (both textured and textureless), scalability, occlusions, and changes in light conditions and object appearance. The dataset features 33 objects (17 toy, 8 household and 8 industry-relevant objects) over 13 scenes of various difficulty. We also present a set of benchmarks to test various desired detector properties, particularly focusing on scalability with respect to the number of objects and resistance to changing light conditions, occlusions and clutter. We also set a baseline for the presented benchmarks using a state-of-the-art DPOD detector. Considering the difficulty of making such datasets, we plan to release the code allowing other researchers to extend this dataset or make their own datasets in the future.
