CL, Dec 14, 2021

Improving Compositional Generalization with Latent Structure and Data Augmentation

Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova

arXiv:2112.07610v2, 30.5630 citations, Has Code

Originality: Incremental advance

AI Analysis

This paper addresses out-of-distribution compositional generalization in semantic parsing. It reports stronger results than prior methods, though it builds incrementally on existing data augmentation and modeling techniques.

The paper tackles compositional generalization in neural networks with a data recombination method: a generative model with a quasi-synchronous context-free grammar backbone produces recombined examples for fine-tuning, which yields new state-of-the-art results on two real-world semantic parsing tasks.

Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL). CSL is a generative model with a quasi-synchronous context-free grammar backbone, which we induce from the training data. We sample recombined examples from CSL and add them to the fine-tuning data of a pre-trained sequence-to-sequence model (T5). This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks, and results in a model even stronger than a T5-CSL ensemble on two real-world compositional generalization tasks. The outcome is new state-of-the-art performance for these challenging semantic parsing tasks, which require generalization to both natural language variation and novel compositions of elements.
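The recombination step described in the abstract can be illustrated with a toy sketch. This is not the authors' CSL implementation: the grammar below is hand-written rather than induced from training data, sampling is uniform rather than driven by a learned generative model, and the names `RULES`, `sample`, and `augment` are hypothetical. It only shows how paired source and target rules let an utterance and its program be expanded in lockstep, so that sampled pairs can be added to a fine-tuning set.

```python
import random

# Hypothetical toy synchronous grammar (assumption: hand-written, not induced).
# Each nonterminal maps to (source template, target template) pairs.
# Uppercase tokens that are keys of RULES are nonterminals shared by both sides.
RULES = {
    "S": [
        ("what is the PROP of PLACE", "answer ( PROP ( PLACE ) )"),
        ("PROP of PLACE", "answer ( PROP ( PLACE ) )"),
    ],
    "PROP": [("population", "population"), ("area", "area")],
    "PLACE": [
        ("texas", "stateid ( texas )"),
        ("the largest state", "largest ( state )"),
    ],
}


def sample(symbol, rng):
    """Expand `symbol` into a (source, target) pair, keeping both sides in sync."""
    src_tpl, tgt_tpl = rng.choice(RULES[symbol])
    # Expand each nonterminal once and reuse the result on both sides.
    subs = {}
    for tok in src_tpl.split():
        if tok in RULES and tok not in subs:
            subs[tok] = sample(tok, rng)
    src = " ".join(subs[t][0] if t in subs else t for t in src_tpl.split())
    tgt = " ".join(subs[t][1] if t in subs else t for t in tgt_tpl.split())
    return src, tgt


def augment(train, n_samples, seed=0):
    """Return the training pairs plus any novel sampled pairs (no duplicates)."""
    rng = random.Random(seed)
    seen = set(train)
    out = list(train)
    for _ in range(n_samples):
        example = sample("S", rng)
        if example not in seen:
            seen.add(example)
            out.append(example)
    return out


if __name__ == "__main__":
    train = [("what is the population of texas",
              "answer ( population ( stateid ( texas ) ) )")]
    for src, tgt in augment(train, n_samples=20):
        print(src, "->", tgt)
```

In the setup the abstract describes, the augmented pairs would then be used to fine-tune a sequence-to-sequence model such as T5; that step is omitted here.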
