RO · AI · LG · ML · Aug 8, 2019

Vision-based Navigation Using Deep Reinforcement Learning

arXiv:1908.03627v2 · 68 citations
AI Analysis

This work addresses the problem of navigating mobile robots to visual targets in complex environments. It is an incremental improvement over existing goal-oriented visual navigation methods.

The paper applies deep reinforcement learning to visual navigation in realistic environments. It proposes a learning architecture that extends the batched A2C algorithm with three auxiliary tasks (segmentation of the observation image, segmentation of the target image, and depth-map prediction) and reports that it outperforms state-of-the-art goal-oriented visual navigation methods on the AI2-THOR simulator.

Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, the application of deep RL to visual navigation with realistic environments is a challenging task. We propose a novel learning architecture capable of navigating an agent, e.g. a mobile robot, to a target given by an image. To achieve this, we have extended the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance. We propose three additional auxiliary tasks: predicting the segmentation of the observation image and of the target image and predicting the depth-map. These tasks enable the use of supervised learning to pre-train a large part of the network and to reduce the number of training steps substantially. The training performance has been further improved by increasing the environment complexity gradually over time. An efficient neural network structure is proposed, which is capable of learning for multiple targets in multiple environments. Our method navigates in continuous state spaces and on the AI2-THOR environment simulator outperforms state-of-the-art goal-oriented visual navigation methods from the literature.
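
As a rough illustration of how auxiliary tasks can be attached to an A2C objective, the sketch below adds auxiliary supervised losses to the usual policy, value, and entropy terms for one batch. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the mean-squared-error form of the auxiliary losses, and the coefficients `value_coef`, `entropy_coef`, and `aux_coef` are all hypothetical and do not come from the abstract.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def a2c_with_auxiliary_loss(log_probs, values, returns, entropies, aux_losses,
                            value_coef=0.5, entropy_coef=0.01, aux_coef=0.1):
    """Combine A2C loss terms with auxiliary losses for one batch.

    log_probs  -- log pi(a_t | s_t) for the actions actually taken
    values     -- critic estimates V(s_t)
    returns    -- bootstrapped returns R_t used as value targets
    entropies  -- policy entropy at each step
    aux_losses -- dict mapping an auxiliary task name to its scalar loss
    All coefficients are illustrative assumptions, not values from the paper.
    """
    n = len(returns)
    # Advantage A_t = R_t - V(s_t). In an autodiff framework the advantage
    # would be treated as a constant inside the policy term.
    advantages = [r - v for r, v in zip(returns, values)]
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    value_loss = sum(adv ** 2 for adv in advantages) / n
    entropy = sum(entropies) / n
    aux_total = sum(aux_losses.values())
    # The entropy term is subtracted so that minimizing the loss
    # encourages exploration.
    return (policy_loss + value_coef * value_loss
            - entropy_coef * entropy + aux_coef * aux_total)
```

In this sketch the three auxiliary tasks named in the abstract would each contribute one entry to `aux_losses`, for example `{"depth": mse(pred_depth, true_depth), "seg_observation": ..., "seg_target": ...}`, where the key names are again only placeholders. Because these terms are ordinary supervised losses, the same functions could in principle be minimized on their own to pre-train the shared part of a network before reinforcement learning starts.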

Code Implementations: 1 repo
