LG | CL | CV | Nov 15, 2023

Attribute Diversity Determines the Systematicity Gap in VQA

CMU, Harvard
arXiv:2311.08695v3 | 25 citations | h-index: 15
Originality: Incremental advance
AI Analysis

This paper addresses compositional generalization in neural networks for visual question answering (VQA). The advance is incremental: it builds on existing work on systematicity.

The paper studies the systematicity gap in VQA and shows that increasing attribute diversity in the training data reduces the performance difference between seen and unseen attribute combinations. Experiments on the CLEVR-HOPE dataset support this effect.

Although modern neural networks often generalize to new combinations of familiar concepts, the conditions that enable such compositionality have long been an open question. In this work, we study the systematicity gap in visual question answering: the performance difference between reasoning on previously seen and unseen combinations of object attributes. To test this, we introduce a novel diagnostic dataset, CLEVR-HOPE. We find that the systematicity gap is not reduced by increasing the quantity of training data, but is reduced by increasing the diversity of training data. In particular, our experiments suggest that the more distinct attribute type combinations are seen during training, the more systematic we can expect the resulting model to be.
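
The systematicity gap described in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy data, and the sign convention (seen accuracy minus unseen accuracy) are assumptions made for illustration only.

```python
from typing import Iterable


def accuracy(results: Iterable[bool]) -> float:
    """Fraction of correct answers in a list of per-question outcomes."""
    results = list(results)
    if not results:
        raise ValueError("need at least one result")
    return sum(results) / len(results)


def systematicity_gap(seen_correct: Iterable[bool],
                      unseen_correct: Iterable[bool]) -> float:
    """Accuracy on seen attribute combinations minus accuracy on unseen ones.

    Assumed convention: a positive value means the model does worse on
    combinations it never saw during training.
    """
    return accuracy(seen_correct) - accuracy(unseen_correct)


# Hypothetical per-question outcomes (True = answered correctly).
seen = [True, True, True, False]      # accuracy 0.75
unseen = [True, False, False, True]   # accuracy 0.50
gap = systematicity_gap(seen, unseen)
print(gap)  # 0.25
```

Under this convention, the abstract's finding corresponds to the gap shrinking as the number of distinct attribute type combinations in the training set grows, while staying roughly unchanged when only the amount of training data grows.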

Code Implementations: 1 repo