CL
Mar 14, 2022

Revisiting the Compositional Generalization Abilities of Neural Sequence Models

MILA
arXiv:2203.07402v1
648 citations
h-index: 59
Originality: Synthesis-oriented
AI Analysis

This work argues that the compositional generalization abilities of neural sequence models have been underestimated, a finding relevant to researchers in natural language processing and machine learning. The contribution is incremental, since it builds on existing benchmarks.

The paper tackles compositional generalization in neural sequence models, showing that modifying the training distribution enables standard seq-to-seq models to achieve near-perfect generalization performance on the SCAN benchmark.

Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally generalize. In this paper, we focus on one-shot primitive generalization as introduced by the popular SCAN benchmark. We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance, thereby showing that their compositional generalization abilities were previously underestimated. We perform detailed empirical analysis of this phenomenon. Our results indicate that the generalization performance of models is highly sensitive to the characteristics of the training data, which should be carefully considered when designing such benchmarks in the future.
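To make the one-shot primitive setup concrete, the sketch below builds a toy SCAN-style split in which one primitive ("jump") appears only in isolation during training and only in composed commands at test time. The two-word grammar, the `build_split` helper, the `extra_primitives` parameter, and the `prim0`, `prim1`, ... vocabulary are all hypothetical simplifications introduced here for illustration. The abstract states only that the training distribution is modified "in simple and intuitive ways", so adding more distinct primitives to the training set is shown as one assumed example of such a modification, not as the paper's confirmed procedure.

```python
# Toy SCAN-style data (a heavily simplified assumption, not the real benchmark grammar).
BASE_PRIMITIVES = {"walk": "I_WALK", "run": "I_RUN", "look": "I_LOOK"}
HELD_OUT = ("jump", "I_JUMP")          # seen only in isolation during training
MODIFIERS = {"twice": 2, "thrice": 3}  # repeat the action this many times


def interpret(command, lexicon):
    """Map a command such as 'walk twice' to its action sequence."""
    words = command.split()
    action = lexicon[words[0]]
    repeat = MODIFIERS[words[1]] if len(words) > 1 else 1
    return " ".join([action] * repeat)


def build_split(extra_primitives=0):
    """Return (train, test) lists of (command, actions) pairs.

    extra_primitives is a hypothetical knob: it adds that many new
    primitives (prim0, prim1, ...) that appear in all contexts in training.
    """
    lexicon = dict(BASE_PRIMITIVES)
    for i in range(extra_primitives):
        lexicon[f"prim{i}"] = f"I_PRIM{i}"

    # Fully observed primitives appear alone and with every modifier.
    train_cmds = []
    for word in lexicon:
        train_cmds.append(word)
        train_cmds.extend(f"{word} {mod}" for mod in MODIFIERS)

    # The held-out primitive is shown exactly once, in isolation.
    held_word, held_action = HELD_OUT
    lexicon[held_word] = held_action
    train_cmds.append(held_word)

    # Test commands compose the held-out primitive with known modifiers.
    test_cmds = [f"{held_word} {mod}" for mod in MODIFIERS]

    train = [(c, interpret(c, lexicon)) for c in train_cmds]
    test = [(c, interpret(c, lexicon)) for c in test_cmds]
    return train, test


baseline_train, test = build_split(extra_primitives=0)
augmented_train, _ = build_split(extra_primitives=5)
print(len(baseline_train), len(augmented_train), test)
```

Under these assumptions, a seq-to-seq model would be trained once on `baseline_train` and once on `augmented_train`, then evaluated on the same `test` pairs, so that any difference in accuracy can be attributed to the training distribution rather than to the model.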
