CL | CV | Mar 19, 2023

FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering

arXiv:2303.10699v1 | 269 citations | h-index: 26
Originality: Synthesis-oriented
AI Analysis

This work addresses dataset limitations in fact-based visual question answering, but it is incremental because it builds on an existing dataset.

The authors address imbalance and concentration in the Fact-based Visual Question Answering (FVQA) dataset by introducing FVQA 2.0, which adds adversarial variants of test questions. They show that systems trained on the original data can be vulnerable to these samples, and that an augmentation scheme requiring no human annotations reduces this vulnerability.

The widely used Fact-based Visual Question Answering (FVQA) dataset contains visually grounded questions that require information retrieval using common sense knowledge graphs to answer. It has been observed that the original dataset is highly imbalanced and concentrated on a small portion of its associated knowledge graph. We introduce FVQA 2.0, which contains adversarial variants of test questions to address this imbalance. We show that systems trained with the original FVQA train sets can be vulnerable to adversarial samples, and we demonstrate an augmentation scheme to reduce this vulnerability without human annotations.

