CV · AI · CL · RO · Jul 14, 2020

Explore and Explain: Self-supervised Navigation and Recounting

arXiv:2007.07268v1 · 20 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of building embodied agents that explore an environment and describe it at the same time. The contribution appears incremental, since it combines existing techniques in a new setting.

The paper tackles the problem of an embodied agent that explores an unknown environment while generating natural language descriptions of what it sees. It integrates a self-supervised exploration module with a captioning model. The approach is evaluated on the Matterport3D dataset, with experiments covering both navigation and explanation capabilities.

Embodied AI has been recently gaining attention as it aims to foster the development of autonomous and intelligent agents. In this paper, we devise a novel embodied setting in which an agent needs to explore a previously unknown environment while recounting what it sees during the path. In this context, the agent needs to navigate the environment driven by an exploration goal, select proper moments for description, and output natural language descriptions of relevant objects and scenes. Our model integrates a novel self-supervised exploration module with penalty, and a fully-attentive captioning model for explanation. Also, we investigate different policies for selecting proper moments for explanation, driven by information coming from both the environment and the navigation. Experiments are conducted on photorealistic environments from the Matterport3D dataset and investigate the navigation and explanation capabilities of the agent as well as the role of their interactions.
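The three decisions described in the abstract (explore, pick a moment to speak, describe) can be pictured with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Step` fields, the rule that the penalty applies to a repeated turn, the thresholds in `should_describe`, and the stand-in `describe` function that replaces a real captioning model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    """One timestep of a hypothetical episode (all fields are illustrative)."""
    prediction_error: float  # assumed: surprise of a learned forward model
    action: str              # e.g. "forward", "turn_left", "turn_right"
    num_objects: int         # assumed: number of objects detected in view


def intrinsic_reward(step: Step, prev_action: Optional[str],
                     penalty: float = 0.1) -> float:
    """Curiosity-style reward minus a penalty.

    Assumption: the penalty is charged when the agent repeats the same
    turning action, which discourages spinning in place.
    """
    repeated_turn = (step.action in ("turn_left", "turn_right")
                     and step.action == prev_action)
    return step.prediction_error - (penalty if repeated_turn else 0.0)


def should_describe(step: Step, min_objects: int = 3,
                    min_surprise: float = 0.5) -> bool:
    """Hypothetical speaker policy: speak when the view is object-rich
    (environment signal) or the agent is surprised (navigation signal)."""
    return (step.num_objects >= min_objects
            or step.prediction_error >= min_surprise)


def describe(step: Step) -> str:
    """Placeholder for a captioning model; returns a fixed template."""
    return f"I see {step.num_objects} objects."


def run_episode(steps: List[Step]) -> List[Tuple[float, Optional[str]]]:
    """Return (reward, caption-or-None) for each step of the episode."""
    out = []
    prev_action = None
    for step in steps:
        reward = intrinsic_reward(step, prev_action)
        caption = describe(step) if should_describe(step) else None
        out.append((reward, caption))
        prev_action = step.action
    return out
```

Calling `run_episode` on a short list of `Step` objects yields one reward per step and a caption only at the steps the speaker policy selects, which mirrors the interaction between navigation and explanation that the abstract highlights.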

