CL · CV · LG · ML · Jul 7, 2020

What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets

arXiv:2007.03626v1 · 14 citations
Originality: Incremental advance
AI Analysis

The paper addresses question answering biases that mislead multimodal models and hinder generalization in video QA, offering insights for dataset design and model debugging. The advance is incremental, as it builds on existing bias analysis work.

The paper analyzed question answering biases in video QA datasets, finding that pretrained language models could answer 37-48% of questions correctly without video context, far exceeding the 20% random baseline, and identified annotators and question types as sources of bias.

Question answering biases in video QA datasets can mislead multimodal models into overfitting to QA artifacts and jeopardize the models' ability to generalize. Understanding how strong these QA biases are and where they come from helps the community measure progress more accurately and gives researchers insights for debugging their models. In this paper, we analyze QA biases in popular video question answering datasets and discover that pretrained language models can answer 37-48% of questions correctly without using any multimodal context information, far exceeding the 20% random-guess baseline for 5-choose-1 multiple-choice questions. Our ablation study shows that biases can come from annotators and from question types. Specifically, questions from annotators seen during training are better predicted by the model, and reasoning or abstract questions incur more bias than factual, direct questions. We also show empirically that using annotator-non-overlapping train-test splits can reduce QA biases for video QA datasets.
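The annotator-non-overlapping split mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `annotator_disjoint_split`, the `annotator_id` field, and the greedy fill strategy are all assumptions made for illustration. The idea is simply to assign whole annotators, rather than individual questions, to either the train or the test side.

```python
import random
from collections import defaultdict


def annotator_disjoint_split(examples, test_fraction=0.2, seed=0):
    """Split QA examples so that no annotator appears in both train and test.

    `examples` is a list of dicts carrying a hypothetical "annotator_id" key;
    real datasets may store annotator identity differently, or not at all.
    """
    # Group every example under the annotator who wrote it.
    by_annotator = defaultdict(list)
    for example in examples:
        by_annotator[example["annotator_id"]].append(example)

    # Shuffle annotators (not examples) with a fixed seed for reproducibility.
    annotators = sorted(by_annotator)
    random.Random(seed).shuffle(annotators)

    # Greedily fill the test set with whole annotators until it reaches the
    # target size; everyone else goes to train.
    target_test_size = test_fraction * len(examples)
    train, test = [], []
    for annotator in annotators:
        bucket = test if len(test) < target_test_size else train
        bucket.extend(by_annotator[annotator])
    return train, test


if __name__ == "__main__":
    # Toy data: 10 annotators with 5 questions each.
    toy = [{"annotator_id": f"a{i % 10}", "question": f"q{i}"} for i in range(50)]
    train, test = annotator_disjoint_split(toy)
    print(len(train), len(test))
```

Because whole annotators are moved at once, the test set can overshoot the requested fraction when a prolific annotator is drawn last; a production split would likely balance by annotator workload as well.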

