HC · AI · CL · CV · LG · Sep 4, 2019

Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning

arXiv:1909.01871v61063 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of interactive navigation for agents in complex environments. The advance is incremental: it builds on existing imitation learning and multimodal assistance frameworks.

The paper tackles the problem of enabling mobile agents to complete object-finding tasks by requesting and interpreting natural language and visual assistance from simulated human helpers. The proposed approach achieves a higher task success rate than the baselines in both seen and unseen environments.

Mobile agents that can leverage help from humans can potentially accomplish more complex tasks than they could entirely on their own. We develop "Help, Anna!" (HANNA), an interactive photo-realistic simulator in which an agent fulfills object-finding tasks by requesting and interpreting natural language-and-vision assistance. An agent solving tasks in a HANNA environment can leverage simulated human assistants, called ANNA (Automatic Natural Navigation Assistants), which, upon request, provide natural language and visual instructions to direct the agent towards the goals. To address the HANNA problem, we develop a memory-augmented neural agent that hierarchically models multiple levels of decision-making, and an imitation learning algorithm that teaches the agent to avoid repeating past mistakes while simultaneously predicting its own chances of making future progress. Empirically, our approach is able to ask for help more effectively than competitive baselines and, thus, attains higher task success rate on both previously seen and previously unseen environments. We publicly release code and data at https://github.com/khanhptnk/hanna . A video demo is available at https://youtu.be/18P94aaaLKg .

Code Implementations: 1 repo
