Planning with a Receding Horizon for Manipulation in Clutter using a Learned Value Function
This work addresses the challenge of transferring manipulation planning from simulation to the real world. Its contribution appears incremental, since it builds on existing sampling-based planners and reinforcement learning techniques.
The paper tackles manipulation in clutter by proposing a Receding Horizon Planner (RHP) that interleaves planning and execution in real time, using a learned value function to estimate the cost-to-go efficiently. Experiments in simulation and on real-world tasks show that this approach lets the robot react effectively to uncertain real-world dynamics.
Manipulation in clutter requires solving complex sequential decision-making problems in an environment rich with physical interactions. The transfer of motion planning solutions from simulation to the real world, in open loop, suffers from the inherent uncertainty in modelling real-world physics. We propose interleaving planning and execution in real time, in a closed-loop setting, using a Receding Horizon Planner (RHP) for pushing manipulation in clutter. In this context, we address the problem of finding a suitable value-function-based heuristic for efficient planning, and for estimating the cost-to-go from the horizon to the goal. We estimate such a value function first by using plans generated by an existing sampling-based planner. Then, we further optimize the value function through reinforcement learning. We evaluate our approach and compare it to state-of-the-art planning techniques for manipulation in clutter. We conduct experiments in simulation with artificially injected uncertainty on the physics parameters, as well as in real-world tasks of manipulation in clutter. We show that this approach enables the robot to react to the uncertain dynamics of the real world effectively.
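To make the closed-loop idea concrete, the following is a minimal sketch of a receding horizon loop, not the paper's implementation. It assumes a toy 2D point robot instead of a pushing task with a physics simulator, a random-shooting search over short action sequences, and a hand-written `value` function standing in for the learned one. All names (`step`, `reward`, `value`, `rhp_action`, `run`) and all parameter values are hypothetical. Each rollout is scored by its discounted rewards up to the horizon plus the value estimate at the horizon state; only the first action of the best rollout is executed, and planning then restarts from the observed state.

```python
import math
import random

# Hypothetical discrete action set: small moves along each axis.
ACTIONS = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]


def step(state, action):
    """Assumed planning model: move a 2D point by the action vector."""
    return (state[0] + action[0], state[1] + action[1])


def reward(state, goal):
    """Per-step reward: negative distance to the goal."""
    return -math.dist(state, goal)


def value(state, goal):
    """Stand-in for a learned value function (estimated return beyond the horizon)."""
    return -10.0 * math.dist(state, goal)


def rhp_action(state, goal, rng, horizon=3, n_rollouts=32, gamma=0.95):
    """Return the first action of the best-scoring sampled rollout."""
    best_score, best_first = -math.inf, None
    for _ in range(n_rollouts):
        s, score, first = state, 0.0, None
        for t in range(horizon):
            a = rng.choice(ACTIONS)
            if first is None:
                first = a
            s = step(s, a)
            score += (gamma ** t) * reward(s, goal)
        # The value estimate at the horizon replaces planning all the way to the goal.
        score += (gamma ** horizon) * value(s, goal)
        if score > best_score:
            best_score, best_first = score, first
    return best_first


def run(start, goal, seed=0, max_steps=200, tol=0.15, noise=0.02):
    """Interleave planning and execution; execution noise mimics model mismatch."""
    rng = random.Random(seed)
    state = start
    for i in range(max_steps):
        if math.dist(state, goal) < tol:
            return state, i
        a = rhp_action(state, goal, rng)
        # The executed motion differs from the planning model by injected noise.
        state = (state[0] + a[0] + rng.uniform(-noise, noise),
                 state[1] + a[1] + rng.uniform(-noise, noise))
    return state, max_steps


if __name__ == "__main__":
    final_state, n_steps = run((0.0, 0.0), (1.0, 1.0))
    print(final_state, n_steps)
```

Because the loop replans from the observed state after every executed action, errors caused by the injected noise do not accumulate the way they would when a full plan is executed in open loop.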