Pragmatic inference and visual abstraction enable contextual flexibility during visual communication
This work addresses how people flexibly adapt visual communication to context, a question relevant to both cognitive science and AI. It provides an algorithmically explicit theory, although the contribution is incremental in that it builds on existing concepts of visual abstraction and pragmatic inference.
The study examined how people adapt drawings for communication by pairing participants in a drawing-based reference game. On 'far' trials, where the objects belonged to different categories, sketchers achieved high recognition accuracy (the abstract gives no specific figures) while using fewer strokes, less ink, and less time than on 'close' trials, where the objects belonged to the same basic-level category. The authors developed a computational model combining visual abstraction and pragmatic inference; it fit the human data well and outperformed lesioned variants.
Visual modes of communication are ubiquitous in modern life --- from maps to data plots to political cartoons. Here we investigate drawing, the most basic form of visual communication. Participants were paired in an online environment to play a drawing-based reference game. On each trial, both participants were shown the same four objects, but in different locations. The sketcher's goal was to draw one of these objects so that the viewer could select it from the array. On `close' trials, objects belonged to the same basic-level category, whereas on `far' trials objects belonged to different categories. We found that people exploited shared information to efficiently communicate about the target object: on far trials, sketchers achieved high recognition accuracy while applying fewer strokes, using less ink, and spending less time on their drawings than on close trials. We hypothesized that humans succeed in this task by recruiting two core faculties: visual abstraction, the ability to perceive the correspondence between an object and a drawing of it; and pragmatic inference, the ability to judge what information would help a viewer distinguish the target from distractors. To evaluate this hypothesis, we developed a computational model of the sketcher that embodied both faculties, instantiated as a deep convolutional neural network nested within a probabilistic program. We found that this model fit human data well and outperformed lesioned variants. Together, this work provides the first algorithmically explicit theory of how visual perception and social cognition jointly support contextual flexibility in visual communication.
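The interaction between the two faculties can be illustrated with a toy calculation. The sketch below is not the authors' model: it swaps in a generic pragmatic-speaker formulation in which a sketcher trades off how well a viewer could pick out the target against the cost of producing the drawing. All names (`listener_probs`, `sketcher_probs`), the parameters `scale`, `alpha`, and `cost_weight`, and every similarity and cost value are hypothetical; the similarity scores merely stand in for what a convolutional network might supply.

```python
import math


def listener_probs(similarities, scale=10.0):
    """P(object | sketch): softmax over sketch-object similarity scores.

    `similarities` holds one assumed score per object in the array;
    `scale` is a hypothetical sharpness parameter.
    """
    weights = [math.exp(scale * s) for s in similarities]
    total = sum(weights)
    return [w / total for w in weights]


def sketcher_probs(candidates, target=0, alpha=5.0, cost_weight=1.0):
    """P(sketch | target, context) for each candidate sketch.

    Utility = log P_listener(target | sketch) - cost_weight * cost,
    and the sketcher chooses sketches with probability proportional
    to exp(alpha * utility).
    """
    utilities = {}
    for name, (similarities, cost) in candidates.items():
        informativity = math.log(listener_probs(similarities)[target])
        utilities[name] = alpha * (informativity - cost_weight * cost)
    # Subtract the maximum before exponentiating for numerical stability.
    max_u = max(utilities.values())
    weights = {n: math.exp(u - max_u) for n, u in utilities.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical inputs: (similarity to [target, distractor x3], production cost).
# Far context: distractors come from other categories, so even a simple
# sketch resembles the target far more than any distractor.
FAR = {
    "simple": ([0.80, 0.10, 0.10, 0.10], 0.2),
    "detailed": ([0.95, 0.05, 0.05, 0.05], 1.0),
}
# Close context: distractors share the target's category, so a simple
# sketch resembles all four objects almost equally.
CLOSE = {
    "simple": ([0.80, 0.75, 0.75, 0.75], 0.2),
    "detailed": ([0.95, 0.40, 0.40, 0.40], 1.0),
}

far = sketcher_probs(FAR)
close = sketcher_probs(CLOSE)

for label, probs in (("far", far), ("close", close)):
    summary = ", ".join(f"{name}={p:.3f}" for name, p in probs.items())
    print(f"{label}: {summary}")
```

Under these assumed values, the toy sketcher favors the cheap drawing in the far context and the costlier, more detailed drawing in the close context, which mirrors the qualitative pattern reported for human sketchers. Setting `cost_weight=0.0` removes the pressure toward economy, loosely analogous to lesioning one component of a model.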