Think Visually: Question Answering through Virtual Imagery
This work addresses geometric reasoning for question-answering systems, but it appears incremental as it builds on existing visual representation methods.
The paper tackles geometric reasoning in question answering by introducing the Dynamic Spatial Memory Network (DSMN), which generates and reasons over latent visual representations. It also proposes two synthetic benchmarks, FloorPlanQA and ShapeIntersection, to evaluate the geometric reasoning of QA systems, and the reported experiments support DSMN's effectiveness.
In this paper, we study the problem of geometric reasoning in the context of question answering. We introduce the Dynamic Spatial Memory Network (DSMN), a new deep network architecture designed for answering questions that admit latent visual representations. DSMN learns to generate and reason over such representations. Further, we propose two synthetic benchmarks, FloorPlanQA and ShapeIntersection, to evaluate the geometric reasoning capability of QA systems. Experimental results validate the effectiveness of our proposed DSMN for visual thinking tasks.