AI | CL | CV | LG | Oct 26, 2018

Neural Modular Control for Embodied Question Answering

arXiv:1810.11181v2 | 145 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient, interpretable navigation for AI agents in realistic indoor environments. It represents an incremental advance in embodied AI.

The paper tackles the problem of learning navigation policies from language input over long planning horizons in embodied question answering. It introduces a modular hierarchical approach with compositional subgoals and reports significant improvements over prior work on the EQA benchmark in House3D.

We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. 'exit room', 'find kitchen', 'find refrigerator', etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA (Das et al., 2018) benchmark in House3D (Wu et al., 2018), requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering.
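
The control flow described in the abstract (a master policy proposing subgoals that specialized sub-policies then execute) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 1-D corridor `ToyEnv`, the fixed subgoal plan, and the names `make_sub_policy`, `master_policy`, and `run_episode` are all hypothetical stand-ins, and the learned networks and the imitation and reinforcement learning stages are replaced here by hand-written rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A subgoal is a (task, argument) pair, e.g. ("find", "kitchen").
Subgoal = Tuple[str, str]


@dataclass
class ToyEnv:
    """Hypothetical 1-D corridor standing in for a House3D environment."""
    position: int = 0

    def step(self, action: str) -> int:
        # Only "forward" changes the state; "stop" ends a sub-policy.
        if action == "forward":
            self.position += 1
        return self.position


def make_sub_policy(target: int, max_steps: int = 10) -> Callable[[ToyEnv], List[str]]:
    """Build a scripted sub-policy that walks to `target`, then stops.

    In the paper a sub-policy is learned; this rule is an assumption
    made only to keep the sketch runnable.
    """
    def sub_policy(env: ToyEnv) -> List[str]:
        actions: List[str] = []
        while env.position < target and len(actions) < max_steps:
            env.step("forward")
            actions.append("forward")
        actions.append("stop")  # hand control back to the master policy
        return actions
    return sub_policy


def master_policy(question: str, completed: List[Subgoal]) -> Optional[Subgoal]:
    """Propose the next subgoal, or None when the plan is finished.

    A fixed plan replaces the learned, question-conditioned master policy.
    """
    plan: List[Subgoal] = [("exit", "room"), ("find", "kitchen"), ("find", "refrigerator")]
    return plan[len(completed)] if len(completed) < len(plan) else None


def run_episode(
    question: str,
    env: ToyEnv,
    sub_policies: Dict[Subgoal, Callable[[ToyEnv], List[str]]],
) -> List[Tuple[Subgoal, int]]:
    """Alternate between the master policy and the selected sub-policy."""
    completed: List[Subgoal] = []
    trace: List[Tuple[Subgoal, int]] = []
    while (subgoal := master_policy(question, completed)) is not None:
        actions = sub_policies[subgoal](env)  # low-level timescale
        trace.append((subgoal, len(actions)))
        completed.append(subgoal)             # high-level timescale
    return trace
```

The two loops make the multiple timescales explicit: the outer loop advances once per subgoal, while each sub-policy takes many primitive actions before returning control. Because subgoals are plain (task, argument) pairs, they can be sequenced in any order, which mirrors the compositional property the abstract describes.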

Code Implementations: 2 repos