CL, CV, LG | Jan 31, 2021

An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games

arXiv:2102.00424v1802 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving generalization in AI agents for vision-language tasks. The contribution is incremental: it builds on existing guessing-game paradigms.

The study investigates how playing visual guessing games improves an agent's generalization to novel NLP downstream tasks such as Visual Question Answering (VQA). The authors' Self-play via Iterated Experience Learning (SPIEL) method increases accuracy by +7.79 on CompGuessWhat?! and harmonic average accuracy by +5.31 on the TDIUC VQA dataset.

Guessing games are a prototypical instance of the "learning by interacting" paradigm. This work investigates how well an artificial agent can benefit from playing guessing games when later asked to perform on novel NLP downstream tasks such as Visual Question Answering (VQA). We propose two ways to exploit playing guessing games: 1) a supervised learning scenario in which the agent learns to mimic successful guessing games and 2) a novel way for an agent to play by itself, called Self-play via Iterated Experience Learning (SPIEL). We evaluate the ability of both procedures to generalize: an in-domain evaluation shows an increased accuracy (+7.79) compared with competitors on the evaluation suite CompGuessWhat?!; a transfer evaluation shows improved performance for VQA on the TDIUC dataset in terms of harmonic average accuracy (+5.31) thanks to more fine-grained object representations learned via SPIEL.
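The transfer result above is reported as a harmonic average accuracy. As a minimal sketch of how such a score could be computed, assuming it is the harmonic mean of per-question-type accuracies, the snippet below compares it with the arithmetic mean. The function name `harmonic_average_accuracy` and the question-type names and values are hypothetical illustrations, not figures or code from the paper.

```python
def harmonic_average_accuracy(per_type_accuracy):
    """Harmonic mean of per-question-type accuracies (values in [0, 1]).

    Assumption: one accuracy per question type, all weighted equally.
    """
    values = list(per_type_accuracy.values())
    if not values:
        raise ValueError("need at least one question type")
    # A single zero accuracy drives the harmonic mean to zero.
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def arithmetic_average_accuracy(per_type_accuracy):
    """Plain mean of the same per-type accuracies, for comparison."""
    values = list(per_type_accuracy.values())
    return sum(values) / len(values)


# Hypothetical per-type accuracies, chosen only to illustrate the effect.
example = {"counting": 0.5, "object_presence": 1.0}
harmonic = harmonic_average_accuracy(example)
arithmetic = arithmetic_average_accuracy(example)
print(f"harmonic={harmonic:.4f} arithmetic={arithmetic:.4f}")
```

The harmonic mean is pulled toward the weakest question type, so a gain on this metric suggests improvement on the harder types rather than only on the frequent, easy ones.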

