RO CV Sep 14, 2019

Deep Robotic Prediction with hierarchical RGB-D Fusion

Yaoxian Song, Jun Wen, Yuejiao Fei, Changbin Yu

arXiv:1909.06585v2 6.24 citations

Originality: Incremental advance

AI Analysis

This paper addresses robotic grasping in varied environments, but it appears incremental because it builds on existing RGB-D and 3D methods.

The paper tackles robotic arm grasping by proposing a real-time multimodal hierarchical encoder-decoder neural network that fuses RGB and depth data for humanoid grasping in 3D space under partial observation. It reports a success rate above 90% in both table-surface and 3D-space scenarios.

Robotic arm grasping is a fundamental operation in robotic control tasks. Most current methods for robotic grasping focus either on RGB-D policies in the table-surface scenario or on 3D point cloud analysis and inference in 3D space. In contrast to these methods, we propose a novel real-time multimodal hierarchical encoder-decoder neural network that fuses RGB and depth data to realize humanoid robotic grasping in 3D space with only partial observation. We consider both the quantification of uncertainty in the raw depth data and depth estimation that fuses RGB. We develop a general labeling method to label ground truth on common RGB-D datasets. We evaluate the effectiveness and performance of our method on a physical robot setup, where it achieves a success rate of over 90% in both table-surface and 3D-space scenarios.
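The abstract does not spell out the network's layers, so the following is only a toy sketch of what "hierarchical" RGB-D fusion in an encoder-decoder can look like: each modality is downsampled into a pyramid, the two are fused at every level, and a decoder upsamples the coarsest fused map and merges it with the finer ones. All function names here (`avg_pool2`, `upsample2`, `fuse`, `hierarchical_fusion`) are hypothetical, and the fixed averaging stands in for the learned convolutions a real model would use. It is not the authors' architecture.

```python
from typing import List

Grid = List[List[float]]  # a single-channel 2D feature map


def avg_pool2(x: Grid) -> Grid:
    """Downsample by averaging non-overlapping 2x2 blocks."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [
            (x[2 * i][2 * j] + x[2 * i][2 * j + 1]
             + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def upsample2(x: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(a: Grid, b: Grid, w_a: float = 0.5) -> Grid:
    """Assumed placeholder fusion: convex combination of two same-sized maps."""
    return [
        [w_a * p + (1.0 - w_a) * q for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def hierarchical_fusion(rgb: Grid, depth: Grid, levels: int = 2) -> Grid:
    """Fuse RGB and depth maps at several scales, then decode back to full size.

    Assumes both inputs share a shape whose sides are divisible by
    2 ** (levels - 1).
    """
    # Encoder: fuse the two modalities at each pyramid level.
    fused_levels = []
    r, d = rgb, depth
    for _ in range(levels):
        fused_levels.append(fuse(r, d))
        r, d = avg_pool2(r), avg_pool2(d)

    # Decoder: start at the coarsest level, upsample, merge with skip maps.
    out = fused_levels[-1]
    for skip in reversed(fused_levels[:-1]):
        out = fuse(upsample2(out), skip)
    return out


rgb_map = [[1.0] * 4 for _ in range(4)]
depth_map = [[3.0] * 4 for _ in range(4)]
result = hierarchical_fusion(rgb_map, depth_map, levels=2)
```

The output keeps the input resolution, which is the property a dense grasp-prediction head would need. Replacing `fuse` with a learned, uncertainty-aware weighting is one place where the depth-uncertainty idea mentioned in the abstract could plausibly enter, though the abstract does not say how the authors do it.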
