CLAI Mar 16, 2020

A Formal Analysis of Multimodal Referring Strategies Under Common Ground

arXiv:2003.07385v1998 citations
AI Analysis

This work addresses the problem of improving multimodal communication models for AI systems interacting with humans, though it appears incremental in nature.

The paper analyzes computationally generated multimodal referring expressions combining gesture and language, revealing formal semantic properties of their interaction conditioned on common ground, and shows these features can improve model predictions of viewer judgments and potentially enhance expression generation.

In this paper, we present an analysis of computationally generated mixed-modality definite referring expressions using combinations of gesture and linguistic descriptions. In doing so, we expose some striking formal semantic properties of the interactions between gesture and language, conditioned on the introduction of content into the common ground between the (computational) speaker and (human) viewer, and demonstrate how these formal features can contribute to training better models to predict viewer judgment of referring expressions, and potentially to the generation of more natural and informative referring expressions.

