IMD: A 6-DoF Pose Estimation Benchmark for Industrial Metallic Objects
This addresses the problem of limited model generalization for industrial robotics, but it is incremental as it primarily introduces a new dataset.
The authors tackled the lack of datasets for 6D pose estimation in industrial settings by creating the Industrial Metallic Dataset (IMD), which includes 45 metallic objects and shows it is more challenging than existing benchmarks.
Object 6DoF (6D) pose estimation is essential for robotic perception, especially in industrial settings. It enables robots to interact with the environment and manipulate objects. However, existing benchmarks on object 6D pose estimation primarily use everyday objects with rich textures and low-reflectivity, limiting model generalization to industrial scenarios where objects are often metallic, texture-less, and highly reflective. To address this gap, we propose a novel dataset and benchmark namely \textit{Industrial Metallic Dataset (IMD)}, tailored for industrial applications. Our dataset comprises 45 true-to-scale industrial components, captured with an RGB-D camera under natural indoor lighting and varied object arrangements to replicate real-world conditions. The benchmark supports three tasks, including video object segmentation, 6D pose tracking, and one-shot 6D pose estimation. We evaluate existing state-of-the-art models, including XMem and SAM2 for segmentation, and BundleTrack and BundleSDF for pose estimation, to assess model performance in industrial contexts. Evaluation results show that our industrial dataset is more challenging than existing household object datasets. This benchmark provides the baseline for developing and comparing segmentation and pose estimation algorithms that better generalize to industrial robotics scenarios.