CV, AI, CL, LG | Nov 12, 2018

Blindfold Baselines for Embodied QA

arXiv:1811.05013v1 | 46 citations
Originality: Synthesis-oriented
AI Analysis

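This work exposes a flaw in the EQA task design: a baseline that never looks at the environment reaches state-of-the-art results on the EQAv1 benchmark, so high accuracy there does not show that an agent is actually using navigation or vision. That is a problem for researchers who rely on the benchmark to develop robust embodied AI systems.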

The paper tackles the Embodied Question Answering (EQA) task with a blindfold (question-only) baseline that ignores visual information, and finds that it achieves state-of-the-art results on the EQAv1 dataset in all cases except when the agent is spawned extremely close to the object.

We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently navigating in a simulated environment, gathering necessary visual information only through first-person vision before finally answering. Consequently, a blindfold baseline which ignores the environment and visual information is a degenerate solution, yet we show through our experiments on the EQAv1 dataset that a simple question-only baseline achieves state-of-the-art results on the EmbodiedQA task in all cases except when the agent is spawned extremely close to the object.
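To make the idea of a question-only baseline concrete, the sketch below predicts an answer from the question text alone, with no environment or image input. It is an illustration under stated assumptions, not the model used in the paper: the majority-answer lookup, the function names `fit_question_only` and `predict`, and the toy question/answer pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def _normalize(question):
    # Lowercase and collapse whitespace so trivially different
    # spellings of the same question share one key.
    return " ".join(question.lower().split())


def fit_question_only(train_pairs):
    """Count answers per question string. No visual input is used."""
    per_question = defaultdict(Counter)
    overall = Counter()
    for question, answer in train_pairs:
        per_question[_normalize(question)][answer] += 1
        overall[answer] += 1
    return per_question, overall


def predict(model, question):
    """Return the most frequent training answer for this question,
    falling back to the most frequent answer overall if unseen."""
    per_question, overall = model
    counts = per_question.get(_normalize(question)) or overall
    return counts.most_common(1)[0][0]


# Hypothetical toy data, made up for illustration (not from EQAv1).
toy_train = [
    ("what color is the sofa?", "brown"),
    ("what color is the sofa?", "brown"),
    ("what color is the sofa?", "grey"),
    ("what room is the bed located in?", "bedroom"),
]

model = fit_question_only(toy_train)
seen_prediction = predict(model, "What color is the sofa?")
unseen_prediction = predict(model, "what color is the lamp?")
```

If a predictor of this kind scores well on a benchmark, the answers are largely recoverable from the question distribution, which is the concern the abstract raises about EQAv1.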

Code Implementations: 1 repo
