Pre-training of Deep RL Agents for Improved Learning under Domain Randomization
This work addresses a key bottleneck in sim-to-real transfer for robotics, offering an incremental improvement over existing methods.
The paper tackles the problem that domain randomization hinders policy training in reinforcement learning. It proposes a pre-trained perception encoder that provides randomization-invariant embeddings, which yields consistently improved performance on randomized tasks and successful zero-shot transfer to a physical robot.
Visual domain randomization in simulated environments is a widely used method to transfer policies trained in simulation to real robots. However, domain randomization and augmentation hamper the training of a policy. As reinforcement learning struggles with a noisy training signal, this additional nuisance can drastically impede training. For difficult tasks, it can even result in complete failure to learn. To overcome this problem, we propose to pre-train a perception encoder that already provides an embedding invariant to the randomization. We demonstrate that this yields consistently improved results on a randomized version of DeepMind Control Suite tasks and on a stacking environment with arbitrary backgrounds, including zero-shot transfer to a physical robot.
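To make the notion of an embedding "invariant to the randomization" concrete, the following toy sketch measures how much an encoder's output changes between two randomized views of the same observation. It is an illustration only and not the pre-training objective of the paper, which is not specified here. The brightness-offset randomization, the linear encoder, and the names `randomize`, `encode`, and `invariance_loss` are all assumptions made for this example. Note also that an invariance term on its own can be satisfied by a constant embedding, so a practical objective would need an additional term that keeps the embedding informative.

```python
import numpy as np

def randomize(obs, rng):
    """Hypothetical visual randomization: add one random brightness offset per image."""
    offset = rng.uniform(-0.5, 0.5, size=(obs.shape[0], 1))
    return obs + offset

def encode(obs, weights, center=True):
    """Toy linear encoder; per-image mean subtraction cancels brightness offsets."""
    if center:
        obs = obs - obs.mean(axis=1, keepdims=True)
    return obs @ weights

def invariance_loss(obs, weights, rng, center=True):
    """Mean squared distance between embeddings of two randomized views."""
    z_a = encode(randomize(obs, rng), weights, center)
    z_b = encode(randomize(obs, rng), weights, center)
    return float(np.mean((z_a - z_b) ** 2))

rng = np.random.default_rng(0)
obs = rng.normal(size=(8, 16))       # 8 flattened toy "images" with 16 pixels each
weights = rng.normal(size=(16, 4))   # projects each image to a 4-dimensional embedding

# The centered encoder ignores the offset, the uncentered one does not.
loss_invariant = invariance_loss(obs, weights, rng, center=True)
loss_sensitive = invariance_loss(obs, weights, rng, center=False)
print(loss_invariant, loss_sensitive)
```

In this toy setting the centered encoder produces essentially the same embedding for both views, while the uncentered encoder passes the randomization through to the embedding. A policy trained on the invariant embedding would therefore not see the nuisance variation that the abstract identifies as the source of the noisy training signal.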