RO, CV · Mar 18, 2025

Foundation Feature-Driven Online End-Effector Pose Estimation: A Marker-Free and Learning-Free Approach

Tianshu Wu, Jiyao Zhang, Shiqian Liang, Zhengxiao Han, Hao Dong

arXiv:2503.14051v1 · 1 citation · h-index: 8 · ICRA

Originality: Incremental advance

AI Analysis

This work addresses the need for marker-free, learning-free online calibration in robotics, enabling cross-robot generalization without requiring the robot to be fully visible. The advance is incremental because it builds on existing foundation models and PnP algorithms.

The paper tackles online end-effector pose estimation without markers or training. It uses pre-trained foundation model features to estimate 2D-3D correspondences, then applies a pose optimization algorithm to handle partial observations and symmetry. The authors report superior flexibility and generalization in their experiments.
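As a rough illustration of how 2D-3D correspondences might be obtained from feature descriptors, the sketch below pairs image features with CAD-model features by mutual nearest neighbour under cosine similarity. This is a generic matching technique, not necessarily the one used in the paper; the function names, the `min_sim` threshold, and the toy descriptors are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def mutual_matches(image_feats, model_feats, min_sim=0.5):
    """Return (image_idx, model_idx) pairs that are each other's best match.

    image_feats: descriptors sampled at 2D image locations (hypothetical input).
    model_feats: descriptors attached to 3D CAD-model points (hypothetical input).
    min_sim:     assumed similarity floor for accepting a pair.
    """
    if not image_feats or not model_feats:
        return []
    sim = [[cosine(f, g) for g in model_feats] for f in image_feats]
    n_img, n_mod = len(image_feats), len(model_feats)
    # Best model feature for every image feature.
    best_model = [max(range(n_mod), key=lambda j: row[j]) for row in sim]
    # Best image feature for every model feature.
    best_image = [max(range(n_img), key=lambda i: sim[i][j]) for j in range(n_mod)]
    # Keep only pairs that agree in both directions and are similar enough.
    return [
        (i, j)
        for i, j in enumerate(best_model)
        if best_image[j] == i and sim[i][j] >= min_sim
    ]
```

Each accepted pair links a pixel location to a 3D model point, which is the form of input a PnP solver expects. The mutual check discards ambiguous descriptors that are close to several model points at once.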

Accurate transformation estimation between camera space and robot space is essential. Traditional methods using markers for hand-eye calibration require offline image collection, limiting their suitability for online self-calibration. Recent learning-based robot pose estimation methods, while advancing online calibration, struggle with cross-robot generalization and require the robot to be fully visible. This work proposes a Foundation feature-driven online End-Effector Pose Estimation (FEEPE) algorithm, characterized by its training-free and cross end-effector generalization capabilities. Inspired by the zero-shot generalization capabilities of foundation models, FEEPE leverages pre-trained visual features to estimate 2D-3D correspondences derived from the CAD model and target image, enabling 6D pose estimation via the PnP algorithm. To resolve ambiguities from partial observations and symmetry, a multi-historical key frame enhanced pose optimization algorithm is introduced, utilizing temporal information for improved accuracy. Compared to traditional hand-eye calibration, FEEPE enables marker-free online calibration. Unlike robot pose estimation, it generalizes across robots and end-effectors in a training-free manner. Extensive experiments demonstrate its superior flexibility, generalization, and performance.
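To make the PnP step concrete, the sketch below computes the mean reprojection error of a candidate pose under a pinhole camera model, which is the quantity PnP solvers typically minimize over the 2D-3D correspondences. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the intrinsics `fx, fy, cx, cy`, and the absence of lens distortion are all assumed here. The abstract does not specify how the multi-key-frame optimization combines frames, so only the single-frame error is shown.

```python
import math


def project(point, R, t, fx, fy, cx, cy):
    """Project a 3D model point into the image with a pinhole camera.

    R is a 3x3 rotation (list of rows) and t a 3-vector, together mapping
    model coordinates into the camera frame. Lens distortion is ignored.
    """
    cam = [sum(R[r][c] * point[c] for c in range(3)) + t[r] for r in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def mean_reprojection_error(points_3d, pixels, R, t, fx, fy, cx, cy):
    """Average pixel distance between projected model points and observations."""
    if not points_3d or len(points_3d) != len(pixels):
        raise ValueError("need equally many 3D points and pixels, at least one")
    total = 0.0
    for p, (u, v) in zip(points_3d, pixels):
        pu, pv = project(p, R, t, fx, fy, cx, cy)
        total += math.hypot(pu - u, pv - v)
    return total / len(points_3d)
```

A pose with a low error on many correspondences is consistent with the observation. With partial views or symmetric end-effectors, several poses can score equally well on a single frame, which is the ambiguity the abstract says the key-frame optimization is meant to resolve.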
