Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding
This work addresses grounded language understanding for AI systems, specifically in reference tasks; its contribution is incremental, since it builds on existing neural and pragmatic approaches.
The authors address the interpretation of color descriptions in grounded communication with a pragmatic neural model that combines recurrent neural network speaker and listener classifiers within a recursive reasoning framework. The model interprets color descriptions more accurately than its component classifiers, with the gains concentrated in the hardest cases: distinguishing very similar colors, or identifying a target color that few utterances adequately express.
We present a model of pragmatic referring expression interpretation in a grounded communication task (identifying colors from descriptions) that draws upon predictions from two recurrent neural network classifiers, a speaker and a listener, unified by a recursive pragmatic reasoning framework. Experiments show that this combined pragmatic model interprets color descriptions more accurately than the classifiers from which it is built, and that much of this improvement results from combining the speaker and listener perspectives. We observe that pragmatic reasoning helps primarily in the hardest cases: when the model must distinguish very similar colors, or when few utterances adequately express the target color. Our findings make use of a newly collected corpus of human utterances in color reference games, which exhibit a variety of pragmatic behaviors. We also show that the embedded speaker model reproduces many of these pragmatic behaviors.
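
To make the idea of recursive pragmatic reasoning concrete, here is a minimal sketch of one reasoning step over a toy table of listener probabilities. It is an illustration, not the paper's implementation: the hand-written probability table stands in for a trained neural listener, and the names `normalize`, `pragmatic_listener`, and the rationality parameter `alpha` are assumptions introduced here. A pragmatic speaker prefers utterances that a literal listener would resolve to the target color, and a pragmatic listener then inverts that speaker.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (assumes a positive total)."""
    total = sum(weights)
    return [w / total for w in weights]


def pragmatic_listener(literal_listener, alpha=1.0):
    """One step of recursive pragmatic reasoning (hypothetical helper).

    literal_listener[u][c] is P_L0(color c | utterance u).
    Returns a table of the same shape holding P_L2(color c | utterance u).
    """
    n_utts = len(literal_listener)
    n_colors = len(literal_listener[0])

    # Pragmatic speaker: S1(u | c) is proportional to L0(c | u) ** alpha,
    # normalized over the candidate utterances.
    speaker = [
        normalize([literal_listener[u][c] ** alpha for u in range(n_utts)])
        for c in range(n_colors)
    ]

    # Pragmatic listener: L2(c | u) is proportional to S1(u | c),
    # normalized over the candidate colors (uniform prior over colors).
    return [
        normalize([speaker[c][u] for c in range(n_colors)])
        for u in range(n_utts)
    ]


# Toy context with two similar colors: index 0 = blue, index 1 = teal.
# Rows are utterances: row 0 = "blue", row 1 = "teal". Values are made up.
l0 = [
    [0.6, 0.4],  # "blue" is only weakly informative for the literal listener
    [0.1, 0.9],  # "teal" strongly picks out the teal color
]
l2 = pragmatic_listener(l0)
```

In this toy context the pragmatic listener is more confident than the literal one that "blue" refers to the blue color, because a speaker who meant teal would more likely have said "teal". This mirrors the kind of case the abstract highlights, where the candidate colors are very similar.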