FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
This work addresses limitations of a visual question answering dataset, which is useful for researchers, but it is incremental because it builds on an existing dataset.
The authors tackled the imbalance and concentration issues in the Fact-based Visual Question Answering (FVQA) dataset by introducing FVQA 2.0, which includes adversarial test samples. They demonstrated that systems trained on the original data are vulnerable to these samples but can be made more robust through an augmentation scheme that requires no human annotations.
The widely used Fact-based Visual Question Answering (FVQA) dataset contains visually-grounded questions that require information retrieval using common sense knowledge graphs to answer. It has been observed that the original dataset is highly imbalanced and concentrated on a small portion of its associated knowledge graph. We introduce FVQA 2.0, which contains adversarial variants of test questions to address this imbalance. We show that systems trained with the original FVQA train sets can be vulnerable to adversarial samples, and we demonstrate an augmentation scheme to reduce this vulnerability without human annotations.