CL Jun 13, 2019

Know What You Don't Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories

arXiv:1906.05518v11094 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating informative object descriptions in language and vision tasks for scenarios involving unknown categories, representing an incremental advancement by combining zero-shot learning with pragmatic reasoning.

The paper tackles the problem of referring to novel objects in zero-shot reference games by modeling a pragmatic speaker that reasons about uncertain object categories, resulting in improved communicative success with fewer nouns and distractor category names compared to a literal speaker.

Zero-shot learning in Language & Vision (L&V) is the task of correctly labelling (or naming) objects of novel categories. Another strand of work in L&V aims at pragmatically informative rather than "correct" object descriptions, e.g. in reference games. We combine these lines of research and model zero-shot reference games, where a speaker needs to successfully refer to a novel object in an image. Inspired by models of "rational speech acts", we extend a neural generator to become a pragmatic speaker reasoning about uncertain object categories. As a result of this reasoning, the generator produces fewer nouns and names of distractor categories as compared to a literal speaker. We show that this conversational strategy for dealing with novel objects often improves communicative success, in terms of resolution accuracy of an automatic listener.
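
To illustrate the kind of reasoning the abstract refers to, here is a minimal sketch of a textbook-style "rational speech acts" speaker. It is not the paper's neural model: the toy lexicon, its scores, and the function names (`literal_listener`, `pragmatic_speaker`) are all assumptions made for illustration. The soft score for "dog" on the target stands in for an uncertain category prediction on a novel object.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights)
    if total == 0:
        return [0.0 for _ in weights]
    return [w / total for w in weights]


def literal_listener(lexicon, utterance):
    """L0(object | utterance): proportional to how well the word fits each object."""
    return normalize(lexicon[utterance])


def pragmatic_speaker(lexicon, target, alpha=1.0):
    """S1(utterance | target): proportional to L0(target | utterance) ** alpha."""
    utterances = list(lexicon)
    scores = [literal_listener(lexicon, u)[target] ** alpha for u in utterances]
    return dict(zip(utterances, normalize(scores)))


# Hypothetical scene: object 0 is the novel target, object 1 is a distractor.
# Each entry is an assumed applicability score of the word to each object.
LEXICON = {
    "animal": [1.0, 1.0],  # fits both objects, so it is uninformative
    "brown": [0.9, 0.1],   # attribute that mostly singles out the target
    "dog": [0.4, 1.0],     # uncertain category guess; also names the distractor
}

speaker = pragmatic_speaker(LEXICON, target=0)
best = max(speaker, key=speaker.get)
print(best, speaker)
```

In this toy setting the speaker prefers the attribute word over the uncertain category name, because the category name would also lead a listener toward the distractor. This mirrors, in a very simplified way, the reported tendency to produce fewer nouns and distractor category names.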

