CV · Jun 1, 2023

Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training

arXiv:2306.00483v1 · 5 citations · h-index: 50
Originality: Incremental advance
AI Analysis

The paper addresses language bias, a specific problem for researchers and practitioners in remote sensing VQA, and represents an incremental improvement.

It tackles language bias in remote sensing visual question answering (RSVQA) by adding an adversarial branch with two regularizers to the training framework. The authors report improved performance, measured with a newly proposed evaluation metric.

The Visual Question Answering (VQA) system offers a user-friendly interface and enables human-computer interaction. However, VQA models commonly face the challenge of language bias, resulting from the learned superficial correlation between questions and answers. To address this issue, in this study, we present a novel framework to reduce the language bias of the VQA for remote sensing data (RSVQA). Specifically, we add an adversarial branch to the original VQA framework. Based on the adversarial branch, we introduce two regularizers to constrain the training process against language bias. Furthermore, to evaluate the performance in terms of language bias, we propose a new metric that combines standard accuracy with the performance drop when incorporating question and random image information. Experimental results demonstrate the effectiveness of our method. We believe that our method can shed light on future work for reducing language bias on the RSVQA task.
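The ingredients of the evaluation metric described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not state how accuracy and the performance drops are combined, so the code stops at computing the components. The function names and the three prediction lists (normal input, random-image input, question-only input) are hypothetical and introduced here for illustration.

```python
def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("answers must not be empty")
    return sum(p == a for p, a in zip(predictions, answers)) / len(answers)


def bias_components(preds_normal, preds_random_image, preds_question_only, answers):
    """Compute standard accuracy and the drops under degraded visual input.

    Assumption: a model that truly uses the image should lose accuracy when
    the image is replaced by a random one or removed entirely. Small drops
    suggest the model is answering mostly from the question.
    The rule for combining these values into one score is NOT given in the
    abstract, so it is deliberately left out.
    """
    acc = accuracy(preds_normal, answers)
    return {
        "accuracy": acc,
        "drop_random_image": acc - accuracy(preds_random_image, answers),
        "drop_question_only": acc - accuracy(preds_question_only, answers),
    }


# Toy data, invented purely for illustration.
answers = ["yes", "no", "yes", "no"]
normal = ["yes", "no", "yes", "yes"]
random_image = ["yes", "yes", "yes", "yes"]
question_only = ["yes", "yes", "yes", "yes"]

components = bias_components(normal, random_image, question_only, answers)
print(components)
```

On this toy data the model answers three of four questions correctly with the real image and two of four without it, so both drops are positive, which is the behaviour one would hope for in a model that relies on visual evidence.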

