Learning visual servo policies via planner cloning
The paper addresses the challenge of efficient visual servoing control for robotics, though the contribution appears incremental, since it builds on existing behavior cloning and planning methods.
The paper tackles the problem of slow model-free policy learning for visual servoing in novel environments by proposing planner cloning, in which policies are trained to mimic a full-state motion planner in simulation. The authors introduce Penalized Q Cloning (PQC), a new behavior cloning algorithm that outperforms several baselines and ablations and achieves a success rate of approximately 87% both in simulation and on a real robot.
Learning control policies for visual servoing in novel environments is an important problem. However, standard model-free policy learning methods are slow. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm. We show that it outperforms several baselines and ablations on some challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies can be transferred effectively onto a real robotic platform, achieving approximately an 87% success rate both in simulation and on a real robot.
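The core idea of planner cloning, as described above, can be illustrated with a toy sketch of plain behavior cloning. This is not PQC, whose details are not given here; it only shows a learned policy being regressed onto the actions of an expert planner. Everything in it is an assumption made for illustration: the 1-D world, the hypothetical `planner_action` function with its proportional gain, and the linear policy. In the setting the abstract describes, the policy would presumably consume camera images rather than the full state that the planner sees.

```python
import random


def planner_action(pos, goal, gain=0.5):
    """Hypothetical full-state 'planner': step proportionally toward the goal."""
    return gain * (goal - pos)


def clone_planner(num_steps=2000, lr=0.1, seed=0):
    """Fit a linear policy a = w_pos * pos + w_goal * goal to the planner via SGD."""
    rng = random.Random(seed)
    w_pos, w_goal = 0.0, 0.0
    for _ in range(num_steps):
        # Sample a random simulated situation.
        pos = rng.uniform(-1.0, 1.0)
        goal = rng.uniform(-1.0, 1.0)
        # Query the expert planner for the action to imitate.
        target = planner_action(pos, goal)
        # Policy prediction and squared-error gradient step.
        pred = w_pos * pos + w_goal * goal
        err = pred - target
        w_pos -= lr * err * pos
        w_goal -= lr * err * goal
    return w_pos, w_goal


w_pos, w_goal = clone_planner()
print(w_pos, w_goal)
```

Because the toy planner is itself linear in the state, the cloned weights should approach the planner's own coefficients (a negative gain on position and a positive gain on the goal). A realistic visual policy would replace the linear model with an image-based network and the toy planner with a motion planner that also handles obstacles.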