Explicit Knowledge-based Reasoning for Visual Question Answering
This paper addresses the problem of general visual question answering by introducing a dataset and an evaluation protocol, although the method itself is incremental in that it combines knowledge bases with existing methods.
The paper tackles visual question answering by integrating a large-scale knowledge base to reason about image contents. This lets it answer complex questions that involve concepts not contained in the image and explain how it reached each answer, and it significantly outperforms the predominant LSTM-based approach in testing.
We describe a method for visual question answering which is capable of reasoning about the contents of an image on the basis of information extracted from a large-scale knowledge base. The method not only answers natural language questions using concepts not contained in the image, but can also provide an explanation of the reasoning by which it developed its answer. The method is capable of answering far more complex questions than the predominant long short-term memory-based approach, and outperforms it significantly in testing. We also provide a dataset and a protocol by which to evaluate such methods, thus addressing one of the key issues in general visual question answering.
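
To make the idea of answering with concepts not contained in the image more concrete, the following is a minimal sketch and not the method described above. It assumes that an upstream detector has already produced a list of concept names for the image, and it uses a tiny hand-written list of triples in place of a large-scale knowledge base. The names `KB`, `find_path`, and `answer_how_many` are hypothetical and introduced only for illustration.

```python
from collections import deque
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base of (subject, relation, object) facts.
# A real system would query a large-scale knowledge base instead.
KB: List[Triple] = [
    ("cat", "is_a", "mammal"),
    ("dog", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("umbrella", "is_a", "object"),
]


def find_path(start: str, target: str, kb: List[Triple]) -> Optional[List[Triple]]:
    """Breadth-first search from start to target, returning the facts used."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for s, r, o in kb:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, r, o)]))
    return None  # no chain of facts links start to target


def answer_how_many(detected: List[str], category: str,
                    kb: List[Triple]) -> Tuple[int, List[str]]:
    """Count detected concepts that belong to category and explain each match."""
    count = 0
    explanation: List[str] = []
    for concept in detected:
        path = find_path(concept, category, kb)
        if path is not None:
            count += 1
            chain = " -> ".join(f"{s} {r} {o}" for s, r, o in path)
            explanation.append(chain or f"{concept} detected directly")
    return count, explanation


# Assumed detector output for one image; "animal" never appears in it.
detected_concepts = ["cat", "dog", "umbrella"]
count, explanation = answer_how_many(detected_concepts, "animal", KB)
print(count)
for line in explanation:
    print(line)
```

Here the question "How many animals are in the image?" is answered even though "animal" is not among the detected concepts, and the chain of facts used for each match doubles as a simple explanation of the answer.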