CL | Nov 5, 2025

Context informs pragmatic interpretation in vision-language models

Alvin Wei Ming Tan, Ben Prystawski, Veronica Boyce, Michael C. Frank

arXiv:2511.03908v1 | 2.7 | h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses context-sensitive pragmatic reasoning by vision-language models in multi-turn linguistic environments, showing that relevant context improves model performance.

The study tested humans and vision-language models on trials from iterated reference games. Without relevant context, models performed above chance but substantially worse than humans; with relevant context, model performance increased dramatically over trials.

Iterated reference games - in which players repeatedly pick out novel referents using language - present a test case for agents' ability to perform context-sensitive pragmatic reasoning in multi-turn linguistic environments. We tested humans and vision-language models on trials from iterated reference games, varying the given context in terms of amount, order, and relevance. Without relevant context, models were above chance but substantially worse than humans. However, with relevant context, model performance increased dramatically over trials. Few-shot reference games with abstract referents remain a difficult task for machine learning models.
