CV, AI
Sep 4, 2018

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering

arXiv:1809.01124v1
117 citations
AI Analysis

This paper addresses the problem of combining observed content with general knowledge, which matters for autonomous agents and virtual assistants. The contribution is incremental, since it builds on an existing dataset and existing methods.

The paper tackles factual visual question answering with a learning-based approach that retrieves facts from a knowledge base via a learned embedding space. It reports state-of-the-art results on the recently introduced fact-based visual question answering dataset, outperforming competing methods by more than 5%.

Question answering is an important task for autonomous agents and virtual assistants alike and was shown to support the disabled in efficiently navigating an overwhelming environment. Many existing methods focus on observation-based questions, ignoring our ability to seamlessly combine observed content with general knowledge. To understand interactions with a knowledge base, a dataset has been introduced recently and keyword matching techniques were shown to yield compelling results despite being vulnerable to misconceptions due to synonyms and homographs. To address this issue, we develop a learning-based approach which goes straight to the facts via a learned embedding space. We demonstrate state-of-the-art results on the challenging recently introduced fact-based visual question answering dataset, outperforming competing methods by more than 5%.
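To make the contrast with keyword matching concrete, the following is a minimal sketch of retrieval in an embedding space. It is not the paper's model: the function names `cosine` and `retrieve_facts`, the fact triples, and the hand-made three-dimensional vectors are all hypothetical and stand in for embeddings that would be learned from image and question features. The homograph "bat" appears in two facts, and the query vector, rather than the keyword, decides which one is retrieved.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_facts(query_vec, fact_vecs, top_k=1):
    """Return the top_k facts whose vectors are most similar to query_vec."""
    scored = [(cosine(query_vec, vec), fact) for fact, vec in fact_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:top_k]]


# Hypothetical toy embeddings (assumption: in a real system these are learned).
FACT_VECS = {
    ("umbrella", "UsedFor", "keeping dry"): [0.9, 0.1, 0.0],
    ("bat", "IsA", "animal"): [0.0, 0.8, 0.6],
    ("bat", "UsedFor", "baseball"): [0.1, 0.2, 0.95],
}

# Hypothetical embedding of an image of a baseball game plus the question
# "What is the object in the player's hand used for?"
query = [0.05, 0.3, 0.9]

top = retrieve_facts(query, FACT_VECS, top_k=1)
ranked = retrieve_facts(query, FACT_VECS, top_k=3)
print(top)
```

A keyword matcher that sees only the word "bat" cannot separate the two "bat" facts, whereas the similarity ranking places the baseball fact first because the query vector carries the visual and linguistic context.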

