RO · CV · Mar 6, 2025

Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks

arXiv:2503.04308v3 · h-index: 17 · IROS
Originality: Incremental advance
AI Analysis

The paper addresses a specific problem for robotic applications such as bartending, where errors accumulate across detection, planning, and action execution. The advance is incremental: it builds on existing open-vocabulary methods and contributes new data and an auto-labeling pipeline.

Open-vocabulary object detectors fail to distinguish subclasses of glasses because existing datasets contain too little variety of them. The paper addresses this by introducing a novel real-world dataset and an auto-labeling pipeline. The resulting baseline model outperforms state-of-the-art open-vocabulary approaches and achieves an 81% success rate in a human-robot bartending scenario.

Datasets for object detection often do not account for enough variety of glasses, due to their transparent and reflective properties. Specifically, open-vocabulary object detectors, widely used in embodied robotic agents, fail to distinguish subclasses of glasses. This scientific gap poses an issue for robotic applications that suffer from accumulating errors between detection, planning, and action execution. This paper introduces a novel method for acquiring real-world data from RGB-D sensors that minimizes human effort. We propose an auto-labeling pipeline that generates labels for all the acquired frames based on the depth measurements. We provide a novel real-world glass object dataset GlassNICOLDataset that was collected on the Neuro-Inspired COLlaborator (NICOL), a humanoid robot platform. The dataset consists of 7850 images recorded from five different cameras. We show that our trained baseline model outperforms state-of-the-art open-vocabulary approaches. In addition, we deploy our baseline model in an embodied agent approach to the NICOL platform, on which it achieves a success rate of 81% in a human-robot bartending scenario.
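
The abstract says only that labels are generated from depth measurements and does not spell out the procedure. As a rough illustration of how depth alone can yield a bounding-box label, the sketch below compares a frame against a depth map of the empty scene and boxes the pixels that moved closer to the camera. The background-subtraction approach, the function name depth_to_box, and the 2 cm threshold are all assumptions made for illustration, not the authors' pipeline.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


def depth_to_box(
    depth: List[List[float]],
    background: List[List[float]],
    min_diff_m: float = 0.02,  # assumed threshold, not taken from the paper
) -> Optional[Box]:
    """Return the tight box around pixels closer to the camera than the background.

    Hypothetical helper: both inputs are depth maps in metres of equal size.
    """
    xs, ys = [], []
    for y, (row, bg_row) in enumerate(zip(depth, background)):
        for x, (d, bg) in enumerate(zip(row, bg_row)):
            # Skip invalid readings (0 is a common "no measurement" value).
            if d <= 0 or bg <= 0:
                continue
            if bg - d > min_diff_m:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing stands out from the background
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x5 depth maps: a flat table at 1.0 m and an object at 0.9 m.
background = [[1.0] * 5 for _ in range(4)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (2, 3):
        frame[y][x] = 0.9
frame[0][0] = 0.0  # a dropout pixel, ignored by the helper

box = depth_to_box(frame, background)
print(box)
```

Depth readings on transparent and reflective surfaces are often unreliable, so a real pipeline would need more than this simple threshold.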
