CLLGOct 7, 2019

Compositional Generalization for Primitive Substitutions

arXiv:1910.02612v11039 citations
Originality Highly original
AI Analysis

This addresses the challenge of enabling neural networks to generalize compositionally in NLP tasks, representing a strong specific gain rather than a broad paradigm shift.

The paper tackles the problem of compositional generalization in neural networks, which is lacking compared to human language learning, by introducing a method using two representations to encode compositionality, resulting in significant accuracy improvements from 14.0% to 98.8% in the SCAN Jump task and from 92.0% to 99.7% in the TurnLeft task.

Compositional generalization is a basic mechanism in human language learning, but current neural networks lack such ability. In this paper, we conduct fundamental research for encoding compositionality in neural networks. Conventional methods use a single representation for the input sentence, making it hard to apply prior knowledge of compositionality. In contrast, our approach leverages such knowledge with two representations, one generating attention maps, and the other mapping attended input words to output symbols. We reduce the entropy in each representation to improve generalization. Our experiments demonstrate significant improvements over the conventional methods in five NLP tasks including instruction learning and machine translation. In the SCAN domain, it boosts accuracies from 14.0% to 98.8% in Jump task, and from 92.0% to 99.7% in TurnLeft task. It also beats human performance on a few-shot learning task. We hope the proposed approach can help ease future research towards human-level compositional language learning.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes