CV
May 24, 2019

Deep Reason: A Strong Baseline for Real-World Visual Reasoning

arXiv:1905.10226v2
3 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a baseline for researchers in visual reasoning, but it is incremental as it builds on existing methods without introducing major innovations.

The paper tackled the problem of real-world visual reasoning on the GQA dataset by developing a strong baseline that achieved 60.93% accuracy, securing sixth place in the 2019 challenge.

This paper presents a strong baseline for real-world visual reasoning on GQA, which achieves 60.93% accuracy in the GQA 2019 challenge and placed sixth. GQA is a large dataset with 22M questions involving spatial understanding and multi-step inference. To support further research in this area, we identify three components that are crucial to performance: multi-source features, a fine-grained encoder, and a score-weighted ensemble. We provide a series of analyses of their impact on performance.
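
The abstract names a score-weighted ensemble but does not say how it is computed. As a minimal sketch of one plausible reading, assume each model outputs a probability per candidate answer and is weighted in proportion to its validation score. The function name `score_weighted_ensemble`, its arguments, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def score_weighted_ensemble(
    model_probs: List[Dict[str, float]],
    val_scores: List[float],
) -> Tuple[str, Dict[str, float]]:
    """Combine per-model answer probabilities, weighting each model by its score.

    Assumption (not from the paper): weights are the validation scores
    normalized to sum to 1, and the ensemble output is the weighted average
    of the per-answer probabilities.
    """
    if not model_probs or len(model_probs) != len(val_scores):
        raise ValueError("need one validation score per model")
    total = sum(val_scores)
    if total <= 0:
        raise ValueError("validation scores must sum to a positive value")

    weights = [score / total for score in val_scores]
    combined: Dict[str, float] = {}
    for probs, weight in zip(model_probs, weights):
        for answer, p in probs.items():
            # An answer missing from a model's output contributes 0 for that model.
            combined[answer] = combined.get(answer, 0.0) + weight * p

    best_answer = max(combined, key=combined.get)
    return best_answer, combined


# Toy example: two hypothetical models that disagree on a yes/no question.
probs_a = {"yes": 0.7, "no": 0.3}
probs_b = {"yes": 0.4, "no": 0.6}
answer, combined = score_weighted_ensemble([probs_a, probs_b], [0.9, 0.1])
```

With these toy weights the higher-scoring model dominates, so the ensemble follows its prediction; swapping the two scores flips the answer. A uniform average corresponds to passing equal scores.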

