RO · Jun 9

UMI-Bench 1.0: An Open and Reproducible Real-World Benchmark for Tabletop Robotic Manipulation with UMI Data

Shi Jin, Yuntian Wang, Yuhui Duan, Di Wu, Gaoqi Dong, Xiaohang Liu, Xiaotong Li, Hongfei Jia, Zehao Zhang, Tianyu Wang, Zhongjie Jia, Yuanqi Yao
arXiv:2606.10382v1
AI Analysis

For researchers developing UMI-based robotic manipulation policies, this benchmark offers a standardized testbed to measure generalization in real physical environments, addressing the lack of dedicated evaluation frameworks in this domain.

UMI-Bench 1.0 introduces the first real-world benchmark for evaluating UMI-style manipulation policies, providing a standardized protocol for data collection, scene reset, policy execution, and result logging to enable reproducible and auditable evaluation.

Real-robot evaluation is essential for understanding whether learned manipulation policies can operate reliably outside curated demonstrations. This need is particularly pressing for Universal Manipulation Interface (UMI)-style policies, whose performance depends on the coupling between wrist-view observations, action representation, data collection, and physical deployment. Existing real-world benchmarks have made important progress, but they are not designed around this UMI data-to-deployment setting. We present UMI-Bench 1.0, a local-first real-robot benchmark for standardized evaluation of UMI-style manipulation policies. To the best of our knowledge, this is the first benchmark dedicated to real-world evaluation of UMI-based manipulation models. UMI-Bench aligns data collection, scene reset, policy execution, result logging, and task-factor analysis within a unified protocol. By making the full evaluation process reproducible and auditable, UMI-Bench provides a practical testbed for measuring how UMI-trained policies generalize to real physical manipulation.
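The abstract names the protocol stages (scene reset, policy execution, result logging, task-factor analysis) but gives no interface details. As a rough illustration only, the loop below sketches how those stages might fit together. Every name here (`TrialRecord`, `run_trials`, `reset_scene`, `run_policy`, the factor labels) is hypothetical and is not taken from UMI-Bench; the callbacks stand in for whatever the real robot stack provides.

```python
import json
from collections import defaultdict
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass
class TrialRecord:
    """One logged evaluation trial (hypothetical schema, not UMI-Bench's)."""
    task: str
    factor: str        # the task factor varied in this trial, e.g. "lighting"
    trial_index: int
    success: bool


def run_trials(
    task: str,
    factors: List[str],
    trials_per_factor: int,
    reset_scene: Callable[[str, str], None],
    run_policy: Callable[[str, str], bool],
) -> List[TrialRecord]:
    """Reset the scene, execute the policy, and record the outcome per trial."""
    records: List[TrialRecord] = []
    for factor in factors:
        for i in range(trials_per_factor):
            reset_scene(task, factor)                  # assumed user-supplied reset
            success = bool(run_policy(task, factor))   # assumed user-supplied rollout
            records.append(TrialRecord(task, factor, i, success))
    return records


def to_jsonl(records: List[TrialRecord]) -> str:
    """Serialize trials as JSON Lines so every result can be audited later."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def success_rate_by_factor(records: List[TrialRecord]) -> Dict[str, float]:
    """Aggregate the success rate for each task factor."""
    counts = defaultdict(lambda: [0, 0])  # factor -> [successes, trials]
    for r in records:
        counts[r.factor][0] += int(r.success)
        counts[r.factor][1] += 1
    return {factor: s / n for factor, (s, n) in counts.items()}
```

In this sketch, logging one record per trial, rather than only an aggregate score, is what would let a reader recompute any reported success rate from the raw log.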
