Learning Physics-Based Manipulation in Clutter: Combining Image-Based Generalization and Look-Ahead Planning
This paper addresses the challenge of multi-step robotic manipulation in complex, real-world settings. It is an incremental improvement that combines existing techniques for generalization and planning.
The paper tackles the problem of learning physics-based manipulation skills for robots in cluttered environments, enabling generalization over object types, shapes, and numbers through an image-based representation and look-ahead planning, with results demonstrated in simulated and real-world experiments.
Physics-based manipulation in clutter involves complex interactions between multiple objects. In this paper, we consider the problem of learning, from interaction in a physics simulator, manipulation skills to solve this multi-step sequential decision-making problem in the real world. Our approach has two key properties: (i) the ability to generalize and transfer manipulation skills (over the type, shape, and number of objects in the scene) using an abstract image-based representation that enables a neural network to learn useful features; and (ii) the ability to perform look-ahead planning in the image space using a physics simulator, which is essential for such multi-step problems. We show, in sets of simulated and real-world experiments (video available at https://youtu.be/EmkUQfyvwkY), that by learning to evaluate actions in an abstract image-based representation of the real world, the robot can generalize and adapt to the object shapes in challenging real-world environments.
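To make the interplay between the two properties concrete, the following is a minimal toy sketch of look-ahead planning over an abstract image, not the implementation used in the paper. Everything in it is an assumption made for illustration: the 5x5 grid world, the one-cell push rule in `simulate` (standing in for a physics simulator), the hand-written `value` function (standing in for a learned neural network that scores images), the `STEP_COST` penalty, and all function names.

```python
# Toy illustration only: grid world, push rule, and value function are assumptions.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIZE = 5        # side length of the toy grid (assumption)
STEP_COST = 1   # per-action penalty so shorter plans are preferred (assumption)


def render(state):
    """Render a state as a 3-channel binary image: target, clutter, goal."""
    target, clutter, goal = state
    image = [[[0] * SIZE for _ in range(SIZE)] for _ in range(3)]
    image[0][target[0]][target[1]] = 1
    for r, c in clutter:
        image[1][r][c] = 1
    image[2][goal[0]][goal[1]] = 1
    return image


def find(channel):
    """Return the (row, col) of the first non-zero pixel in a channel."""
    for r in range(SIZE):
        for c in range(SIZE):
            if channel[r][c]:
                return (r, c)
    return None


def value(image):
    """Stand-in for a learned evaluator: negative Manhattan distance to the goal."""
    t, g = find(image[0]), find(image[2])
    return -(abs(t[0] - g[0]) + abs(t[1] - g[1]))


def in_bounds(cell):
    return 0 <= cell[0] < SIZE and 0 <= cell[1] < SIZE


def simulate(state, action):
    """Stand-in for a physics simulator: move the target one cell, pushing clutter."""
    target, clutter, goal = state
    dr, dc = ACTIONS[action]
    new_target = (target[0] + dr, target[1] + dc)
    if not in_bounds(new_target):
        return state  # blocked by the workspace boundary
    if new_target in clutter:
        pushed = (new_target[0] + dr, new_target[1] + dc)
        if not in_bounds(pushed) or pushed in clutter:
            return state  # the clutter object cannot move, so nothing moves
        clutter = (clutter - {new_target}) | {pushed}
    return (new_target, clutter, goal)


def lookahead(state, depth):
    """Depth-limited search: simulate actions, score leaf images with value()."""
    target, _, goal = state
    if depth == 0 or target == goal:
        return value(render(state)), []
    best_score, best_plan = float("-inf"), []
    for action in ACTIONS:
        score, plan = lookahead(simulate(state, action), depth - 1)
        score -= STEP_COST
        if score > best_score:
            best_score, best_plan = score, [action] + plan
    return best_score, best_plan


def run(state, depth=2, max_steps=20):
    """Receding-horizon loop: plan, execute the first action, then replan."""
    executed = []
    for _ in range(max_steps):
        if state[0] == state[2]:
            break
        _, plan = lookahead(state, depth)
        state = simulate(state, plan[0])
        executed.append(plan[0])
    return state, executed


# Target at (0, 0), one clutter object at (0, 1), goal at (0, 3).
start = ((0, 0), frozenset({(0, 1)}), (0, 3))
final_state, executed = run(start)
print(executed, final_state[0])
```

In this sketch the planner only ever scores rendered images, so the same `value` function applies regardless of how many clutter objects are in the scene. This is the toy analogue of the generalization property described above; replacing `value` with a trained network and `simulate` with a real physics engine is left outside the scope of the sketch.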