CLLGJun 7, 2021

Lexicon Learning for Few-Shot Neural Sequence Modeling

arXiv:2106.03993v133 citations
Originality Highly original
AI Analysis

This addresses the issue of systematic generalization failures in neural models for language processing applications like semantic parsing and machine translation, particularly in few-shot scenarios.

The paper tackles the problem of neural sequence models' brittleness in low-resource settings by augmenting decoders with a lexical translation mechanism, showing improved systematic generalization across tasks from cognitive science, formal semantics, and machine translation.

Sequence-to-sequence transduction is the core problem in language processing applications as diverse as semantic parsing, machine translation, and instruction following. The neural network models that provide the dominant solution to these problems are brittle, especially in low-resource settings: they fail to generalize correctly or systematically from small datasets. Past work has shown that many failures of systematic generalization arise from neural models' inability to disentangle lexical phenomena from syntactic ones. To address this, we augment neural decoders with a lexical translation mechanism that generalizes existing copy mechanisms to incorporate learned, decontextualized, token-level translation rules. We describe how to initialize this mechanism using a variety of lexicon learning algorithms, and show that it improves systematic generalization on a diverse set of sequence modeling tasks drawn from cognitive science, formal semantics, and machine translation.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes