CL, AI, LG
Jan 31, 2018

Interactive Grounded Language Acquisition and Generalization in a 2D World

arXiv:1802.01433v4
80 citations
Originality: Incremental advance
AI Analysis

This paper addresses grounded language learning for AI agents and enables generalization to new word combinations, though its approach is an incremental advance.

The paper tackles interactive language acquisition in a 2D world, where an agent learns language from scratch for navigation and question answering. The model significantly outperforms five comparison methods at interpreting zero-shot sentences.

We build a virtual agent for learning language in a 2D maze-like world. The agent sees images of the surrounding environment, listens to a virtual teacher, and takes actions to receive rewards. It interactively learns the teacher's language from scratch based on two language use cases: sentence-directed navigation and question answering. It learns simultaneously the visual representations of the world, the language, and the action control. By disentangling language grounding from other computational routines and sharing a concept detection function between language grounding and prediction, the agent reliably interpolates and extrapolates to interpret sentences that contain new word combinations or new words missing from training sentences. The new words are transferred from the answers of language prediction. Such a language ability is trained and evaluated on a population of over 1.6 million distinct sentences consisting of 119 object words, 8 color words, 9 spatial-relation words, and 50 grammatical words. The proposed model significantly outperforms five comparison methods for interpreting zero-shot sentences. In addition, we demonstrate human-interpretable intermediate outputs of the model in the appendix.
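To make "sentences that contain new word combinations" concrete, here is a minimal sketch of how a zero-shot split over word pairs could be built. It is an illustration only, not the authors' data pipeline: the function name `split_zero_shot`, the `holdout_fraction` parameter, and the tiny word lists are all hypothetical. The idea is that held-out (color, object) pairs never occur in training, while each individual word still does.

```python
import itertools
import random


def split_zero_shot(colors, objects, holdout_fraction=0.2, seed=0):
    """Split (color, object) pairs into training and held-out sets.

    Held-out pairs never appear in training, but every individual
    color and object word still appears in at least one training pair.
    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    pairs = list(itertools.product(colors, objects))
    rng.shuffle(pairs)
    n_holdout = int(len(pairs) * holdout_fraction)

    train = list(pairs)
    held_out = []
    for pair in pairs:
        if len(held_out) >= n_holdout:
            break
        color, obj = pair
        remaining = [p for p in train if p != pair]
        # Only hold the pair out if both of its words stay covered in training.
        color_seen = any(c == color for c, _ in remaining)
        object_seen = any(o == obj for _, o in remaining)
        if color_seen and object_seen:
            train = remaining
            held_out.append(pair)
    return train, held_out


if __name__ == "__main__":
    # Placeholder vocabulary, far smaller than the one described in the abstract.
    train_pairs, zero_shot_pairs = split_zero_shot(
        ["red", "green", "blue"], ["apple", "cat", "cup"]
    )
    print("train:", train_pairs)
    print("zero-shot:", zero_shot_pairs)
```

An agent evaluated on the held-out pairs has seen each word, but never that combination, which is the kind of zero-shot interpretation the abstract describes.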

Code Implementations: 2 repos