CV · Jan 23, 2023

HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images

arXiv:2301.09460v1 · 26 citations · h-index: 60
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of VQA datasets for aerial imagery, which matters for applications such as disaster monitoring and urban planning. The advance is incremental, since it builds on existing VQA frameworks.

The authors tackled the challenge of visual question answering (VQA) for high-resolution aerial images by introducing a new dataset, HRVQA, with 53,512 images and 1,070,240 QA pairs, and proposed a model, GFTransformer, that achieved superior performance compared to previous state-of-the-art methods.

Visual question answering (VQA) is an important and challenging multimodal task in computer vision. Recently, a few efforts have been made to bring the VQA task to aerial images, owing to its potential real-world applications in disaster monitoring, urban planning, and digital earth product generation. However, the development of VQA in this domain is restricted both by the huge variation in the appearance, scale, and orientation of concepts in aerial images and by the scarcity of well-annotated datasets. In this paper, we introduce a new dataset, HRVQA, which provides 53,512 aerial images of 1024 × 1024 pixels and 1,070,240 semi-automatically generated QA pairs. To benchmark the understanding capability of VQA models for aerial images, we evaluate the relevant methods on HRVQA. Moreover, we propose a novel model, GFTransformer, with gated attention modules and a mutual fusion module. The experiments show that the proposed dataset is quite challenging, especially for questions about specific attributes. Our method achieves superior performance in comparison to the previous state-of-the-art approaches. The dataset and the source code will be released at https://hrvqa.nl/.

