CV AI CL LG

Nov 30, 2017

Embodied Question Answering

Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

arXiv:1711.11543v2 42.2772 citations

Originality: Highly original

AI Analysis

This work addresses how to integrate active perception, language understanding, and goal-driven navigation in a single AI agent, a problem relevant to researchers working on embodied AI and robotics.

This paper introduces Embodied Question Answering (EmbodiedQA), a new AI task where an agent navigates a 3D environment using first-person vision to answer questions about objects within it. The authors developed the necessary environments, end-to-end reinforcement learning agents, and evaluation protocols for this task.

We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange"). This challenging task requires a range of AI skills -- active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions. In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.
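The episode structure described above (random spawn, egocentric observation, navigation, then an answer) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the 2D grid stands in for the 3D environment, `ToyEnv`, `sweep_path`, `moves_toward`, and `run_episode` are hypothetical names, and the scripted sweep replaces the paper's learned reinforcement learning agent, which is not reproduced here.

```python
import random
from dataclasses import dataclass

# Hypothetical action set for a toy grid; the real task uses a 3D environment.
ACTIONS = {"north": (0, -1), "south": (0, 1), "west": (-1, 0), "east": (1, 0)}


@dataclass
class ToyEnv:
    """Illustrative grid world; objects maps (x, y) -> (name, color)."""
    width: int
    height: int
    objects: dict
    agent: tuple = (0, 0)

    def reset(self, rng):
        # Spawn the agent at a random cell, mirroring the random spawn in the task.
        self.agent = (rng.randrange(self.width), rng.randrange(self.height))
        return self.observe()

    def observe(self):
        # Egocentric view: the agent sees only the contents of its own cell.
        return self.objects.get(self.agent)

    def step(self, action):
        dx, dy = ACTIONS[action]
        x = min(max(self.agent[0] + dx, 0), self.width - 1)
        y = min(max(self.agent[1] + dy, 0), self.height - 1)
        self.agent = (x, y)
        return self.observe()


def sweep_path(env):
    """Every cell in row-by-row zigzag order (a scripted exploration plan)."""
    cells = []
    for y in range(env.height):
        xs = range(env.width) if y % 2 == 0 else range(env.width - 1, -1, -1)
        cells.extend((x, y) for x in xs)
    return cells


def moves_toward(start, goal):
    """Axis-aligned moves from start to goal: horizontal first, then vertical."""
    (x, y), (gx, gy) = start, goal
    moves = ["east"] * (gx - x) if gx > x else ["west"] * (x - gx)
    moves += ["south"] * (gy - y) if gy > y else ["north"] * (y - gy)
    return moves


def run_episode(env, target, rng, max_steps=200):
    """Explore until the target object is observed, then answer with its color."""
    obs = env.reset(rng)
    steps = 0
    for cell in sweep_path(env):
        for action in moves_toward(env.agent, cell):
            if obs is not None and obs[0] == target:
                return obs[1], steps
            if steps >= max_steps:
                return None, steps
            obs = env.step(action)
            steps += 1
        if obs is not None and obs[0] == target:
            return obs[1], steps
    return None, steps  # target never observed: no answer


env = ToyEnv(4, 3, {(2, 1): ("car", "orange"), (0, 2): ("sofa", "grey")})
answer, steps = run_episode(env, "car", random.Random(0))
print(f"What color is the car? {answer} (after {steps} steps)")
```

The sweep visits every cell, so the toy agent always finds an object that exists; the step count depends on where it was spawned. A learned agent would instead have to decide where to look and when to stop from its observations alone.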
