RO
Feb 24, 2018

Visual Manipulation Relationship Network

arXiv:1802.08857v2
74 citations
Originality: Incremental advance
AI Analysis

This paper addresses a specific bottleneck in robotics, namely tasks that require ordered grasping in cluttered environments, but the advance is incremental because it builds on existing CNN-based methods.

The paper tackles robotic grasping in multi-object scenes by proposing a Visual Manipulation Relationship Network (VMRN) that detects objects and predicts the manipulation relationships among them in real time; its experiments show that the network performs detection and relationship prediction simultaneously.

Robotic grasping detection is one of the most important fields in robotics, in which great progress has been made in recent years with the help of convolutional neural networks (CNNs). However, including multiple objects in one scene can invalidate the existing CNN-based grasping detection algorithms, because they do not consider the manipulation relationships among objects, which are required to guide the robot to grasp things in the right order. This paper presents a new CNN architecture called Visual Manipulation Relationship Network (VMRN) to help robots detect targets and predict the manipulation relationships in real time. To implement end-to-end training and meet real-time requirements in robot tasks, we propose the Object Pairing Pooling Layer (OP2L), which helps predict all manipulation relationships in one forward pass. Moreover, in order to train VMRN, we collect a dataset named Visual Manipulation Relationship Dataset (VMRD) consisting of 5185 images with more than 17000 object instances and the manipulation relationships between all possible pairs of objects in every image, which are labeled by the manipulation relationship tree. The experimental results show that the new network architecture can detect objects and predict manipulation relationships simultaneously and meet the real-time requirements in robot tasks.
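To make the idea of pairwise manipulation relationships guiding the grasp order concrete, here is a minimal sketch. It is not the paper's implementation: the function names `candidate_pairs` and `grasp_order`, the example objects, and the convention that a pair `(first, second)` means "grasp `first` before `second`" are all assumptions made for illustration. The sketch enumerates the ordered object pairs a relationship predictor would have to score, then derives a valid grasp order from a set of predicted constraints with a standard topological sort.

```python
from collections import defaultdict, deque
from itertools import permutations


def candidate_pairs(objects):
    """Return every ordered pair of distinct objects (n * (n - 1) pairs)."""
    return list(permutations(objects, 2))


def grasp_order(objects, before):
    """Order objects so each (first, second) constraint in `before` holds.

    Assumed convention: (first, second) means `first` must be grasped
    before `second`. Uses Kahn's topological sort and raises ValueError
    if the constraints contain a cycle.
    """
    blocked_by = {obj: 0 for obj in objects}  # unmet prerequisites per object
    unblocks = defaultdict(list)              # object -> objects it frees up
    for first, second in before:
        unblocks[first].append(second)
        blocked_by[second] += 1

    ready = deque(obj for obj in objects if blocked_by[obj] == 0)
    order = []
    while ready:
        obj = ready.popleft()
        order.append(obj)
        for nxt in unblocks[obj]:
            blocked_by[nxt] -= 1
            if blocked_by[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(objects):
        raise ValueError("cyclic manipulation relationships; no valid grasp order")
    return order


# Hypothetical scene: a pen lies on a notebook, which lies on a box.
objects = ["box", "notebook", "pen"]
before = [("pen", "notebook"), ("notebook", "box")]
print(len(candidate_pairs(objects)))  # 6 ordered pairs for 3 objects
print(grasp_order(objects, before))   # ['pen', 'notebook', 'box']
```

The number of ordered pairs grows quadratically with the number of objects, which is one reason scoring all pairs in a single forward pass matters for real-time use.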

