CL AI May 31, 2023

Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency

Maxwell Crouse, Ramon Astudillo, Tahira Naseem, Subhajit Chaudhury, Pavan Kapanipathi, Salim Roukos, Alexander Gray

arXiv:2305.20018v1

Originality: Incremental advance

AI Analysis

This work addresses the challenge of data scarcity in semantic parsing for natural language processing, offering an incremental improvement over existing methods.

The authors address the problem of training neural semantic parsers with limited labeled data by introducing LOCCO, a scalable semi-supervised method that combines self-learning with a count-based prior and a cycle-consistency score. On the WebNLG benchmark, LOCCO yields a 2-point improvement over a self-learning parser under equivalent conditions and a 1.3-point gain over the previous state-of-the-art parser.

We introduce Logical Offline Cycle Consistency Optimization (LOCCO), a scalable, semi-supervised method for training a neural semantic parser. Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text that are then used as new supervision. To increase the quality of annotations, our method utilizes a count-based prior over valid formal meaning representations and a cycle-consistency score produced by a neural text generation model as additional signals. Both the prior and semantic parser are updated in an alternate fashion from full passes over the training data, which can be seen as approximating the marginalization of latent structures through stochastic variational inference. The use of a count-based prior, frozen text generation model, and offline annotation process yields an approach with negligible complexity and latency increases as compared to conventional self-learning. As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model. We demonstrate the utility of LOCCO on the well-known WebNLG benchmark where we obtain an improvement of 2 points against a self-learning parser under equivalent conditions, an improvement of 1.3 points against the previous state-of-the-art parser, and competitive text generation performance in terms of BLEU score.
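The offline annotation pass described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`prior_logprob`, `annotate_offline`, `update_prior`), the toy parser and generator stand-ins, the add-alpha smoothing, and the choice to combine the three signals by summing log scores. The abstract does not specify how the signals are combined, and retraining the parser on the new annotations is omitted.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

# (meaning representation, parser log-probability); hypothetical type for this sketch
Candidate = Tuple[str, float]


def prior_logprob(counts: Counter, mr: str, vocab_size: int, alpha: float = 1.0) -> float:
    """Add-alpha smoothed log-probability of a meaning representation (assumed form of the prior)."""
    total = sum(counts.values())
    return math.log((counts[mr] + alpha) / (total + alpha * vocab_size))


def annotate_offline(
    texts: List[str],
    propose: Callable[[str], List[Candidate]],
    cycle_logprob: Callable[[str, str], float],
    counts: Counter,
    vocab_size: int,
) -> Dict[str, str]:
    """One full pass over unlabeled text: keep the best-scoring candidate per sentence."""
    annotations: Dict[str, str] = {}
    for text in texts:
        best_mr, best_score = None, -math.inf
        for mr, parser_lp in propose(text):
            # Assumption: the three signals are combined by summing log scores.
            score = parser_lp + prior_logprob(counts, mr, vocab_size) + cycle_logprob(mr, text)
            if score > best_score:
                best_mr, best_score = mr, score
        if best_mr is not None:
            annotations[text] = best_mr
    return annotations


def update_prior(annotations: Dict[str, str]) -> Counter:
    """Re-estimate the count-based prior from the annotations of the last full pass."""
    return Counter(annotations.values())


def toy_propose(text: str) -> List[Candidate]:
    # Stand-in for candidates sampled from the parser being trained.
    return [
        ("capital(France, Paris)", math.log(0.4)),
        ("capital(Paris, France)", math.log(0.6)),
    ]


def toy_cycle_logprob(mr: str, text: str) -> float:
    # Stand-in for a frozen text generator's log p(text | mr).
    return 0.0 if mr == "capital(France, Paris)" else -2.0


texts = ["Paris is the capital of France."]
prior = Counter()
for _ in range(2):  # alternate: annotate with the current prior, then update the prior
    annotations = annotate_offline(texts, toy_propose, toy_cycle_logprob, prior, vocab_size=2)
    prior = update_prior(annotations)
    # A real system would also retrain the parser on `annotations` here.
```

In this toy setup the parser alone prefers the reversed triple, but the cycle-consistency stand-in penalizes it enough that the correctly ordered triple is kept as the annotation. Because the generator is frozen and the prior is a simple count table, each pass adds only a rescoring step on top of ordinary self-learning.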
