Affordance detection with Dynamic-Tree Capsule Networks
This work addresses the problem of robust affordance detection for autonomous robots handling unseen objects, representing an incremental improvement over existing convolutional neural network approaches.
The paper tackles affordance detection from 3D point clouds for robotic manipulation by introducing a dynamic tree-structured capsule network, which outperforms state-of-the-art models in viewpoint invariance and parts-segmentation on novel object instances.
Affordance detection from visual input is a fundamental step in autonomous robotic manipulation. Existing solutions to the problem of affordance detection rely on convolutional neural networks. However, these networks do not consider the spatial arrangement of the input data and miss parts-to-whole relationships. Therefore, they fall short when confronted with novel, previously unseen object instances or new viewpoints. One way to overcome these limitations is to use capsule networks. In this paper, we introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds. Using a novel dataset that is reserved exclusively for evaluation and is publicly available from github.com/gipfelen/DTCG-Net, we show that our capsule-based network outperforms current state-of-the-art models on viewpoint invariance and parts-segmentation of new object instances. In the experimental evaluation, we show that our algorithm is superior to current affordance detection methods when grasping previously unseen objects, thanks to the parts-to-whole representation enforced by our capsule network.