Manipulation-Oriented Object Perception in Clutter through Affordance Coordinate Frames
This paper addresses robust robot manipulation in unstructured environments and represents an incremental advance in that area.
The paper tackles the problem of enabling robots to generalize manipulation actions to novel objects in clutter. It introduces the Affordance Coordinate Frame (ACF), which combines affordance and category-level pose representations, and reports that ACF outperforms state-of-the-art methods in object detection and in category-level pose estimation for object parts.
In order to enable robust operation in unstructured environments, robots should be able to generalize manipulation actions to novel object instances. For example, to pour and serve a drink, a robot should be able to recognize novel containers which afford the task. Most importantly, robots should be able to manipulate these novel containers to fulfill the task. To achieve this, we aim to provide robust and generalized perception of object affordances and their associated manipulation poses for reliable manipulation. In this work, we combine the notions of affordance and category-level pose, and introduce the Affordance Coordinate Frame (ACF). With ACF, we represent each object class in terms of individual affordance parts and the compatibility between them, where each part is associated with a part category-level pose for robot manipulation. In our experiments, we demonstrate that ACF outperforms state-of-the-art methods for object detection, as well as category-level pose estimation for object parts. We further demonstrate the applicability of ACF to robot manipulation tasks through experiments in a simulated environment.
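To make the representation described above more concrete, the following is a minimal sketch of how an object class could be expressed as a set of affordance parts, each carrying its own pose, together with a pairwise compatibility check. It is not the implementation used in this work. All names (`PartACF`, `ObjectHypothesis`, `CLASS_PARTS`, `compatible`, `assemble`), the affordance labels, the position-plus-axis pose layout, and the distance-based compatibility rule with its 0.15 m threshold are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class PartACF:
    """One affordance part with a part-level pose (hypothetical layout)."""
    affordance: str  # illustrative label, e.g. "contain" or "grasp"
    position: Vec3   # frame origin in the camera frame, in metres
    axis: Vec3       # unit direction of the frame's principal axis


@dataclass(frozen=True)
class ObjectHypothesis:
    """An object instance expressed as a set of mutually compatible parts."""
    category: str
    parts: Tuple[PartACF, ...]


# Assumed class model: which affordance parts make up each object class.
CLASS_PARTS: Dict[str, FrozenSet[str]] = {
    "mug": frozenset({"contain", "grasp"}),
    "bowl": frozenset({"contain"}),
}


def compatible(a: PartACF, b: PartACF, max_dist: float = 0.15) -> bool:
    """Toy compatibility rule (an assumption): two parts can belong to the
    same object if their frame origins are at most max_dist metres apart."""
    dist = sum((p - q) ** 2 for p, q in zip(a.position, b.position)) ** 0.5
    return dist <= max_dist


def assemble(category: str, parts: List[PartACF],
             max_dist: float = 0.15) -> Optional[ObjectHypothesis]:
    """Return an object hypothesis if the parts match the class model exactly
    and every pair of parts is compatible; otherwise return None."""
    required = CLASS_PARTS[category]
    found = frozenset(p.affordance for p in parts)
    if found != required or len(parts) != len(required):
        return None
    if all(compatible(a, b, max_dist) for a, b in combinations(parts, 2)):
        return ObjectHypothesis(category, tuple(parts))
    return None


body = PartACF("contain", (0.40, 0.00, 0.10), (0.0, 0.0, 1.0))
handle = PartACF("grasp", (0.46, 0.00, 0.10), (1.0, 0.0, 0.0))
far_handle = PartACF("grasp", (0.90, 0.00, 0.10), (1.0, 0.0, 0.0))

mug = assemble("mug", [body, handle])          # parts are close together
not_a_mug = assemble("mug", [body, far_handle])  # handle is too far away
```

In this sketch a manipulation routine would read the `position` and `axis` of the part whose affordance matches the task (for example, the "grasp" part when picking up a mug), which is the sense in which each part carries its own pose for manipulation. A learned compatibility model would replace the fixed distance threshold used here.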