CV · Aug 18, 2025

Checkmate: interpretable and explainable RSVQA is the endgame

arXiv:2508.13086v1 · 1 citation · h-index: 66
Originality: Incremental advance
AI Analysis

This work addresses interpretability and dataset-bias issues in Remote Sensing Visual Question Answering (RSVQA). It is an incremental advance that contributes a new dataset and a new model.

The paper tackles the lack of interpretability and explainability in Remote Sensing Visual Question Answering (RSVQA) with two contributions. It introduces Chessboard, a dataset of 3,123,253 questions with a balanced answer distribution, and it develops Checkmate, an explainable model that identifies the image cells relevant to its decisions. Together, these aim to improve transparency and trustworthiness in RSVQA systems.

Remote Sensing Visual Question Answering (RSVQA) presents unique challenges in ensuring that model decisions are both understandable and grounded in visual content. Current models often suffer from a lack of interpretability and explainability, as well as from biases in dataset distributions that lead to shortcut learning. In this work, we tackle these issues by introducing a novel RSVQA dataset, Chessboard, designed to minimize biases through 3'123'253 questions and a balanced answer distribution. Each answer is linked to one or more cells within the image, enabling fine-grained visual reasoning. Building on this dataset, we develop an explainable and interpretable model called Checkmate that identifies the image cells most relevant to its decisions. Through extensive experiments across multiple model architectures, we show that our approach improves transparency and supports more trustworthy decision-making in RSVQA systems.

