RO · CV · Oct 15, 2020

Pose Estimation for Robot Manipulators via Keypoint Optimization and Sim-to-Real Transfer

arXiv:2010.08054v3 · 51 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in robotic vision for tasks requiring visual feedback, offering an incremental improvement over existing deep learning methods.

The paper tackles uneven keypoint-detection performance on robotic manipulators by proposing an autonomous method for choosing optimal keypoint locations. It relies on synthetic data and domain randomization to improve detection performance and to enable real-world applications such as calibration and pose estimation.

Keypoint detection is an essential building block for many robotic applications such as motion capture and pose estimation. Historically, keypoints have been detected using specially engineered markers such as checkerboards or fiducials. More recently, deep learning methods have been explored because they can detect user-defined keypoints without markers. However, manually selected keypoints can perform unevenly in detection and localization. One example is symmetric robotic tools, where DNN detectors cannot solve the correspondence problem correctly. In this work, we propose a new, autonomous way to define keypoint locations that overcomes these challenges. The approach finds the optimal set of keypoints on robotic manipulators for robust visual detection and localization. Using a robotic simulator as a medium, our algorithm generates synthetic data for DNN training and iteratively optimizes the selection of keypoints. The results show that the detection performance of the DNNs improves significantly when the optimized keypoints are used. We further apply the optimized keypoints to real robotic applications, using domain randomization to bridge the reality gap between the simulator and the physical world. The physical-world experiments show how the proposed method can be applied to the wide breadth of robotic applications that require visual feedback, such as camera-to-robot calibration, robotic tool tracking, and end-effector pose estimation.
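
The abstract describes the keypoint selection only as "an iterative approach", so the following is a minimal sketch of one plausible reading, not the authors' algorithm. It assumes a greedy swap: the keypoint with the largest detection error is replaced by the unused candidate that most lowers the mean error, until no swap helps. The names `optimize_keypoints`, `detection_error`, and `toy_error` are hypothetical. In a real pipeline, `detection_error` would stand for training a DNN on synthetic data and measuring per-keypoint error, which is far more expensive than the toy function used here.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
# Hypothetical interface: returns one detection error per keypoint in the set.
ErrorFn = Callable[[Sequence[Point]], List[float]]


def optimize_keypoints(
    candidates: Sequence[Point],
    k: int,
    detection_error: ErrorFn,
    max_iters: int = 100,
) -> Tuple[List[Point], float]:
    """Greedy swap search (an assumption, not the paper's exact procedure)."""
    if k < 1 or k > len(candidates):
        raise ValueError("k must be between 1 and the number of candidates")

    # Start from the first k candidates and score them.
    current = list(candidates[:k])
    errors = detection_error(current)
    mean_error = sum(errors) / k

    for _ in range(max_iters):
        # Index of the keypoint that is currently detected worst.
        worst = max(range(k), key=errors.__getitem__)

        best_trial = None
        for cand in candidates:
            if cand in current:
                continue
            trial = list(current)
            trial[worst] = cand
            trial_errors = detection_error(trial)
            trial_mean = sum(trial_errors) / k
            # Keep the swap that lowers the mean error the most.
            if trial_mean < mean_error and (
                best_trial is None or trial_mean < best_trial[2]
            ):
                best_trial = (trial, trial_errors, trial_mean)

        if best_trial is None:
            break  # no single swap improves the set
        current, errors, mean_error = best_trial

    return current, mean_error


def toy_error(keypoints: Sequence[Point]) -> List[float]:
    """Stand-in for a trained detector: error grows with the x coordinate."""
    return [abs(p[0]) for p in keypoints]


# Ten candidate locations along a link, listed from worst to best.
candidates = [(float(x), 0.0, 0.0) for x in range(9, -1, -1)]
selected, final_error = optimize_keypoints(candidates, k=3, detection_error=toy_error)
```

With the toy error function, the search ends on the three candidates closest to the origin. A real error function would also depend on how the keypoints interact, for example on symmetric tools where similar-looking locations are confused with each other.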

Code Implementations: 1 repo
