Modular Deep Q Networks for Sim-to-real Transfer of Visuo-motor Policies
This work addresses the challenge of efficient robot learning for real-world applications by reducing reliance on large real-world datasets, though the contribution is incremental because it builds on existing sim-to-real transfer methods.
The paper tackles the problem of costly real-world data collection for robot learning by proposing a modular deep reinforcement learning method for sim-to-real transfer of visuo-motor policies, achieving a fine-tuned accuracy of 1.6 pixels compared to 17.5 pixels with naive transfer.
While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models trained in simulation to a real-world robotic task. We introduce a bottleneck between perception and control, enabling the networks to be trained independently, but then merged and fine-tuned in an end-to-end manner to further improve hand-eye coordination. On a canonical, planar, visually guided robot reaching task, a fine-tuned accuracy of 1.6 pixels is achieved, a significant improvement over naive transfer (17.5 pixels), showing the potential for more complicated and broader applications. Our method provides a technique for more efficient learning and transfer of visuo-motor policies for real robotic systems without relying entirely on large real-world robot datasets.
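The modular structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the two-dimensional bottleneck, the number of joints and actions, and all names (`Perception`, `Control`, `ModularPolicy`) are assumptions chosen for illustration, and training and end-to-end fine-tuning are omitted. The sketch only shows how a low-dimensional bottleneck lets the perception and control modules be built and evaluated independently and then merged into a single image-to-Q-value policy.

```python
import random


def init_layer(rng, n_out, n_in):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


class Perception:
    """Maps a flattened image to the bottleneck (assumed here: a 2-D scene description)."""

    def __init__(self, rng, n_pixels, n_hidden=8, n_bottleneck=2):
        self.l1 = init_layer(rng, n_hidden, n_pixels)
        self.l2 = init_layer(rng, n_bottleneck, n_hidden)

    def __call__(self, image):
        return linear(*self.l2, relu(linear(*self.l1, image)))


class Control:
    """Maps bottleneck values plus joint angles to one Q-value per discrete action."""

    def __init__(self, rng, n_bottleneck, n_joints, n_actions, n_hidden=8):
        self.l1 = init_layer(rng, n_hidden, n_bottleneck + n_joints)
        self.l2 = init_layer(rng, n_actions, n_hidden)

    def __call__(self, scene, joints):
        return linear(*self.l2, relu(linear(*self.l1, scene + joints)))


class ModularPolicy:
    """Merged network: the perception output feeds the control module directly."""

    def __init__(self, perception, control):
        self.perception = perception
        self.control = control

    def q_values(self, image, joints):
        return self.control(self.perception(image), joints)

    def act(self, image, joints):
        """Greedy action: index of the largest Q-value."""
        q = self.q_values(image, joints)
        return max(range(len(q)), key=q.__getitem__)


# Hypothetical sizes: a 4x4 image, 3 joints, 7 discrete actions.
rng = random.Random(0)
perception = Perception(rng, n_pixels=16)
control = Control(rng, n_bottleneck=2, n_joints=3, n_actions=7)
policy = ModularPolicy(perception, control)

image = [rng.random() for _ in range(16)]
joints = [0.1, -0.2, 0.3]
action = policy.act(image, joints)
```

Because the two modules communicate only through the bottleneck, `control` could in principle be trained on simulated scene descriptions without any images, while `perception` is trained or adapted separately. In this sketch, fine-tuning the merged policy would then mean updating the weights of both modules through `q_values`.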