AI, CL, CV, LG | Jul 9, 2018

Talk the Walk: Navigating New York City through Grounded Dialogue

arXiv:1807.03367v3 | 128 citations
AI Analysis

This paper addresses the challenge of grounded dialogue for navigation and contributes a new dataset and method to the AI community, though its advance over existing multi-agent communication tasks is incremental.

The authors tackle the problem of enabling two agents to communicate in natural language in order to navigate New York City. They introduce the "Talk The Walk" dataset and develop the MASC mechanism, which improved localization accuracy by 15% over baselines.

We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception. The task involves two agents (a "guide" and a "tourist") that communicate via natural language in order to achieve a common goal: having the tourist navigate to a given target location. The task and dataset, which are described in detail, are challenging and their full solution is an open problem that we pose to the community. We (i) focus on the task of tourist localization and develop the novel Masked Attention for Spatial Convolutions (MASC) mechanism that allows for grounding tourist utterances into the guide's map, (ii) show it yields significant improvements for both emergent and natural language communication, and (iii) using this method, we establish non-trivial baselines on the full task.

Code Implementations: 1 repo
