RO CV LG | Jul 1, 2025

Box Pose and Shape Estimation and Domain Adaptation for Large-Scale Warehouse Automation

Xihang Yu, Rajat Talak, Jingnan Shi, Ulrich Viereck, Igor Gilitschenski, Luca Carlone

arXiv:2507.00984v1 | 5.71 citations | h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving perception models for warehouse robots without manual annotations. The advance is incremental because it builds on existing domain adaptation methods.

The paper tackles the problem of estimating box pose and shape in warehouse automation by developing a self-supervised domain adaptation pipeline that uses unlabeled real-world data, resulting in significant performance improvements over simulation-only and zero-shot baselines.

Modern warehouse automation systems rely on fleets of intelligent robots that generate vast amounts of data -- most of which remains unannotated. This paper develops a self-supervised domain adaptation pipeline that leverages real-world, unlabeled data to improve perception models without requiring manual annotations. Our work focuses specifically on estimating the pose and shape of boxes and presents a correct-and-certify pipeline for self-supervised box pose and shape estimation. We extensively evaluate our approach across a range of simulated and real industrial settings, including adaptation to a large-scale real-world dataset of 50,000 images. The self-supervised model significantly outperforms models trained solely in simulation and shows substantial improvements over a zero-shot 3D bounding box estimation baseline.
