CL · Dec 14, 2021

Improving Compositional Generalization with Latent Structure and Data Augmentation

arXiv:2112.07610v2630 citations
Originality: Incremental advance
AI Analysis

This paper addresses out-of-distribution generalization in semantic parsing. It offers a more effective solution than prior methods, though it builds incrementally on existing data augmentation and modeling techniques.

The paper tackles compositional generalization in neural networks by introducing a data recombination method that samples new examples from a generative model with a quasi-synchronous context-free grammar backbone. The method achieves state-of-the-art performance on two real-world semantic parsing tasks.

Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL). CSL is a generative model with a quasi-synchronous context-free grammar backbone, which we induce from the training data. We sample recombined examples from CSL and add them to the fine-tuning data of a pre-trained sequence-to-sequence model (T5). This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks, and results in a model even stronger than a T5-CSL ensemble on two real-world compositional generalization tasks. The result is new state-of-the-art performance on these challenging semantic parsing tasks, which require generalization to both natural language variation and novel compositions of elements.
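The recombination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' CSL implementation: CSL induces its grammar from the training data and scores derivations with a learned model, whereas the grammar below is handwritten and sampled uniformly. `TOY_GRAMMAR`, `sample_pair`, and `augment` are hypothetical names chosen for illustration, and the SCAN-like commands are assumed examples.

```python
import random

# ASSUMPTION: a handwritten toy synchronous grammar, not one induced by CSL.
# Each nonterminal maps to (source_template, target_template) pairs; a token
# is treated as a nonterminal only if it is a key of this dictionary.
TOY_GRAMMAR = {
    "S": [("C", "C"), ("C twice", "C C"), ("C thrice", "C C C")],
    "C": [("V", "V"), ("V left", "LTURN V"), ("V right", "RTURN V")],
    "V": [("jump", "JUMP"), ("walk", "WALK"), ("run", "RUN")],
}


def sample_pair(symbol, rng):
    """Sample one (source, target) string pair derived from `symbol`."""
    source_template, target_template = rng.choice(TOY_GRAMMAR[symbol])
    # Expand each nonterminal once, so that source and target share the
    # same sub-derivation wherever that nonterminal appears.
    expansions = {
        token: sample_pair(token, rng)
        for token in source_template.split()
        if token in TOY_GRAMMAR
    }
    source = " ".join(
        expansions[t][0] if t in expansions else t
        for t in source_template.split()
    )
    target = " ".join(
        expansions[t][1] if t in expansions else t
        for t in target_template.split()
    )
    return source, target


def augment(train_pairs, num_samples, seed=0):
    """Return the training pairs followed by newly sampled, unseen pairs."""
    rng = random.Random(seed)
    seen = set(train_pairs)
    augmented = list(train_pairs)
    for _ in range(num_samples):
        pair = sample_pair("S", rng)
        if pair not in seen:  # skip duplicates of existing examples
            seen.add(pair)
            augmented.append(pair)
    return augmented


train = [("jump", "JUMP"), ("walk left", "LTURN WALK")]
augmented = augment(train, num_samples=10)
```

In the paper's setting, the augmented list would serve as fine-tuning data for a sequence-to-sequence model such as T5. The sketch stops at producing the data and makes no claim about how the authors weight or filter sampled examples.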

Code Implementations: 2 repos
