RO, Oct 26, 2017

DoShiCo Challenge: Domain Shift in Control Prediction

arXiv:1710.09860v2, 6.74 citations

Originality: Synthesis-oriented

AI Analysis

The paper addresses the challenge of deploying deep reinforcement learning in real-world applications by focusing on domain shift. Its contribution is incremental: it builds on existing benchmarks by combining domain shift with simulated environments.

The paper tackles the problem of domain shift in deep reinforcement learning for real-world applications by proposing the DoShiCo challenge, which trains policies in basic synthetic environments to perform collision avoidance for drones in realistic settings, achieving flight without collisions in a different simulated environment.

Training deep neural network policies end-to-end for real-world applications has so far required large demonstration datasets collected in the real world or large sets of varied, realistic, and closely related 3D CAD models. These real or virtual data should, moreover, have characteristics very similar to the conditions expected at test time. These stringent requirements, and the time-consuming data collection processes they entail, are currently the most important impediment keeping deep reinforcement learning from being deployed in real-world applications. Therefore, in this work we advocate an alternative approach: instead of avoiding any domain shift by carefully selecting the training data, the goal is to learn a policy that can cope with it. To this end, we propose the DoShiCo challenge: to train a model in very basic synthetic environments, far from realistic, in such a way that it can be applied in more realistic environments and can also take control decisions on real-world data. In particular, we focus on the task of collision avoidance for drones. We created a set of simulated environments that can be used as a benchmark and implemented a baseline method that exploits depth prediction as an auxiliary task to help overcome the domain shift. Even though the policy is trained in very basic environments, it can learn to fly without collisions in a very different, realistic simulated environment. Several benchmarks for reinforcement learning already exist, but they never include a large domain shift. Conversely, several benchmarks in computer vision focus on domain shift, but they take the form of static datasets rather than simulated environments. In this work we claim that it is crucial to combine the two challenges in one benchmark.
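
The abstract describes the baseline only as "exploiting depth prediction as an auxiliary task". One common way to set up such an auxiliary task is a weighted sum of the main control loss and a depth-reconstruction loss. The sketch below illustrates that general idea in plain Python. It is not the paper's implementation: the function names, the choice of squared error for control and absolute error for depth, and the `depth_weight` value are all assumptions made for illustration.

```python
from typing import Sequence


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def multitask_loss(
    control_pred: Sequence[float],
    control_target: Sequence[float],
    depth_pred: Sequence[float],
    depth_target: Sequence[float],
    depth_weight: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """Combine the main control loss with an auxiliary depth loss.

    The control term drives the policy output (e.g. a steering command);
    the depth term pushes the shared features to encode scene geometry.
    """
    control_loss = mean_squared_error(control_pred, control_target)
    depth_loss = mean_absolute_error(depth_pred, depth_target)
    return control_loss + depth_weight * depth_loss
```

In a setup like this, setting `depth_weight` to zero recovers plain control prediction, which makes it easy to compare a policy trained with and without the auxiliary signal.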
