CV AI CL LG | Nov 12, 2018

Blindfold Baselines for Embodied QA

Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville

arXiv:1811.05013v1 | 17.946 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This work exposes a critical flaw in the EQA task design: a simple question-only method achieves state-of-the-art results on the current benchmark, so the benchmark may not measure the navigation and visual reasoning it was designed to test.

The paper addresses the Embodied Question Answering (EQA) task by proposing a blindfold (question-only) baseline that ignores visual information. On the EQAv1 dataset, this baseline achieves state-of-the-art results in all cases except when the agent is spawned very close to the object.

We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently navigating in a simulated environment, gathering necessary visual information only through first-person vision before finally answering. Consequently, a blindfold baseline which ignores the environment and visual information is a degenerate solution, yet we show through our experiments on the EQAv1 dataset that a simple question-only baseline achieves state-of-the-art results on the EmbodiedQA task in all cases except when the agent is spawned extremely close to the object.
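To make the idea of a question-only baseline concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the model used in the paper: it predicts the most frequent training answer for each normalized question string and falls back to the overall most frequent answer for unseen questions. The class name `QuestionOnlyBaseline`, the `normalize` helper, and the toy questions and answers are all hypothetical and are not taken from EQAv1.

```python
from collections import Counter, defaultdict


def normalize(question):
    # Lowercase and drop question marks so trivially different spellings match.
    return " ".join(question.lower().replace("?", " ").split())


class QuestionOnlyBaseline:
    """Hypothetical blindfold baseline: it never sees the environment."""

    def __init__(self):
        self.by_question = defaultdict(Counter)  # answer counts per question
        self.overall = Counter()                 # answer counts over all data

    def fit(self, questions, answers):
        for q, a in zip(questions, answers):
            self.by_question[normalize(q)][a] += 1
            self.overall[a] += 1
        return self

    def predict(self, question):
        counts = self.by_question.get(normalize(question))
        if counts:
            # Most frequent answer seen for this exact question.
            return counts.most_common(1)[0][0]
        # Unseen question: fall back to the globally most frequent answer.
        return self.overall.most_common(1)[0][0]


# Hypothetical toy data, invented for illustration only.
train_questions = [
    "What color is the sofa?",
    "What color is the sofa?",
    "What color is the sofa?",
    "What color is the sofa?",
    "What room is the piano located in?",
    "What room is the piano located in?",
]
train_answers = ["brown", "brown", "brown", "grey", "living room", "living room"]

model = QuestionOnlyBaseline().fit(train_questions, train_answers)
print(model.predict("what color is the sofa"))
print(model.predict("What color is the fridge?"))
```

If a predictor of this kind scores well on a benchmark, the answers are largely predictable from the question text alone, which is the kind of dataset bias a blindfold baseline is meant to reveal.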
