CV | AI | CL | RO | Mar 21, 2018

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

arXiv:1803.07729v2 | 223 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generalizing robot navigation from synthetic to real-world settings, which is crucial for practical applications, though it builds incrementally on existing reinforcement learning techniques.

The paper tackles the problem of vision-and-language navigation in real-world environments by proposing a planned-ahead hybrid reinforcement learning model that combines model-free and model-based approaches. The method significantly outperforms baselines on the Room-to-Room dataset and shows improved generalizability to unseen environments.

Existing studies on vision and language grounding for robot navigation focus on improving model-free deep reinforcement learning (DRL) models in synthetic environments. However, model-free DRL models do not consider the dynamics of real-world environments, and they often fail to generalize to new scenes. In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices: we propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task. Our look-ahead module tightly integrates a look-ahead policy model with an environment model that predicts the next state and the reward. Experimental results suggest that our proposed method significantly outperforms the baselines and achieves the best results on the real-world Room-to-Room dataset. Moreover, our scalable method is more generalizable when transferring to unseen environments.
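The look-ahead idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' architecture: the paper's policy and environment model are learned networks, whereas here they are hand-written stand-ins on a toy 1-D corridor. All names (`imagined_return`, `plan_ahead`, `toy_env_model`, `toy_rollout_policy`, `toy_model_free_score`) and the choice to sum discounted predicted rewards are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

State = int
Action = int

GOAL = 4          # hypothetical goal cell in a corridor of cells 0..5
MAX_CELL = 5


def toy_env_model(state: State, action: Action) -> Tuple[State, float]:
    """Stand-in for a learned environment model: predicts next state and reward."""
    next_state = min(max(state + action, 0), MAX_CELL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def toy_rollout_policy(state: State) -> Action:
    """Stand-in for the look-ahead policy used inside imagined rollouts."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0


def toy_model_free_score(state: State, action: Action) -> float:
    """Stand-in for a model-free action score, deliberately biased toward moving left."""
    return 0.1 if action == -1 else 0.0


def imagined_return(
    state: State,
    action: Action,
    env_model: Callable[[State, Action], Tuple[State, float]],
    rollout_policy: Callable[[State], Action],
    depth: int,
    gamma: float = 0.9,
) -> float:
    """Discounted sum of predicted rewards along one imagined trajectory."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        state, reward = env_model(state, action)
        total += discount * reward
        discount *= gamma
        action = rollout_policy(state)
    return total


def plan_ahead(
    state: State,
    actions: List[Action],
    depth: int = 3,
    weight: float = 1.0,
) -> Tuple[Action, Dict[Action, float]]:
    """Combine the model-free score with the imagined return for each candidate action."""
    scores = {
        a: toy_model_free_score(state, a)
        + weight * imagined_return(state, a, toy_env_model, toy_rollout_policy, depth)
        for a in actions
    }
    best = max(scores, key=lambda a: scores[a])
    return best, scores


best_action, action_scores = plan_ahead(state=2, actions=[-1, 0, 1])
print(best_action, action_scores)
```

In this toy setup the model-free score alone would prefer moving left, while the imagined rollouts shift the decision toward the goal. That is the intuition behind combining the two signals, under the simplifications stated above.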

Code Implementations: 1 repo
