CL · AI · Nov 12, 2023

Learning Globally Optimized Language Structure via Adversarial Training

arXiv:2311.06771v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem in natural language processing, the training of discrete energy-based models for text, and offers an incremental improvement over prior methods.

Learning effective energy-based models for text is difficult because language is discrete. The paper tackles this with an adversarial training strategy, which substantially improves the quality of generated sequences on an arithmetic sequence generation task compared to prior methods.

Recent work has explored integrating autoregressive language models with energy-based models (EBMs) to enhance text generation capabilities. However, learning effective EBMs for text is hampered by the discrete nature of language. This work proposes an adversarial training strategy to address limitations in prior efforts. Specifically, it presents an iterative adversarial attack algorithm that generates negative samples for training the EBM by perturbing text sampled from the autoregressive model. Training on these negatives is intended to let the EBM suppress spurious modes outside the support of the data distribution. Experiments on an arithmetic sequence generation task show that the proposed adversarial training approach substantially improves the quality of generated sequences compared to prior methods. The results highlight the promise of adversarial techniques for discrete EBM training. Key contributions include: (1) an adversarial attack strategy tailored to text that generates negative samples while circumventing the limitations of MCMC; (2) an adversarial training algorithm for EBMs that uses these attacks; (3) empirical validation of the performance improvements on a sequence generation task.
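The summary above does not spell out the attack, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a greedy single-token substitution attack that repeatedly edits a sequence sampled from the autoregressive model so as to lower its energy, together with a simple contrastive objective. The names `adversarial_negative`, `contrastive_gap`, and `toy_energy` are hypothetical, and the energy function is a stand-in for a learned EBM.

```python
import random
from typing import Callable, List, Optional, Sequence

Tokens = List[int]
EnergyFn = Callable[[Tokens], float]


def toy_energy(seq: Tokens) -> float:
    """Stand-in energy (assumption): low when tokens are close to 5."""
    return float(sum(abs(t - 5) for t in seq))


def adversarial_negative(
    seed: Tokens,
    energy: EnergyFn,
    vocab: Sequence[int],
    steps: int = 3,
    rng: Optional[random.Random] = None,
) -> Tokens:
    """Greedy iterative attack (illustrative, not the paper's method).

    At each step, pick a random position and keep the single-token
    substitution that lowers the energy the most, if any does.
    The result is a perturbed sequence that the current EBM rates
    at least as favorably as the seed.
    """
    rng = rng or random.Random(0)
    current = list(seed)  # copy so the caller's sequence is untouched
    for _ in range(steps):
        pos = rng.randrange(len(current))
        best, best_e = current, energy(current)
        for tok in vocab:
            if tok == current[pos]:
                continue
            cand = current[:pos] + [tok] + current[pos + 1:]
            e = energy(cand)
            if e < best_e:  # accept only strict improvements
                best, best_e = cand, e
        current = best
    return current


def contrastive_gap(
    energy: EnergyFn, positives: List[Tokens], negatives: List[Tokens]
) -> float:
    """Mean energy of data minus mean energy of negatives.

    Minimizing this pushes energy down on data and up on negatives.
    """
    pos = sum(energy(s) for s in positives) / len(positives)
    neg = sum(energy(s) for s in negatives) / len(negatives)
    return pos - neg
```

In an actual training loop, `seed` would be a sample from the autoregressive model, `energy` would be the current EBM, and the attack would be rerun after each parameter update so that the negatives track the modes the EBM currently over-rates.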

