CL · AI · LG · Jun 28, 2016

"Show me the cup": Reference with Continuous Representations

arXiv:1606.08777v1 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of individuation in language reference for AI systems, but the advance is incremental because it builds on existing tasks and methods.

The paper tackles the problem of modeling reference to objects in a scene using continuous representations, introducing a neural network that points to intended objects based on descriptions and is competitive with a manually engineered pipeline.

One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects represented by natural images, points to the intended object if the expression has a unique referent, or indicates a failure, if it does not. The model, directly trained on reference acts, is competitive with a pipeline manually engineered to perform the same task, both when referents are purely visual, and when they are characterized by a combination of visual and linguistic properties.
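The task's input/output contract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's trained network: the function names `cosine` and `point`, the 0.8 threshold, and the toy three-dimensional vectors are all hypothetical, and a hand-set similarity threshold stands in for whatever the model learns from reference acts.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def point(query, objects, threshold=0.8):
    """Return the index of the unique matching object, or None on failure.

    Failure covers both cases named in the abstract: no object matches
    the description, or more than one does. The threshold is a
    hypothetical stand-in for a learned decision.
    """
    matches = [i for i, obj in enumerate(objects)
               if cosine(query, obj) >= threshold]
    return matches[0] if len(matches) == 1 else None


# Toy vectors standing in for a description embedding and image embeddings.
cup = [1.0, 0.1, 0.0]
scene_unique = [[0.9, 0.2, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
scene_ambiguous = [[0.9, 0.2, 0.0], [1.0, 0.0, 0.1], [0.0, 1.0, 0.0]]
scene_missing = [[0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]

print(point(cup, scene_unique))     # one cup-like object: index 0
print(point(cup, scene_ambiguous))  # two cup-like objects: None
print(point(cup, scene_missing))    # no cup-like object: None
```

Returning `None` for both the ambiguous and the missing case mirrors the abstract's single "failure" outcome for expressions without a unique referent.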

